Skip to content

Instantly share code, notes, and snippets.

@ncalm
Last active February 17, 2024 12:13
Show Gist options
  • Star 2 You must be signed in to star a gist
  • Fork 1 You must be signed in to fork a gist
  • Save ncalm/29c473a940997c3b112bf06e7e71cf15 to your computer and use it in GitHub Desktop.
Save ncalm/29c473a940997c3b112bf06e7e71cf15 to your computer and use it in GitHub Desktop.
This Excel LAMBDA function creates summary tables similar to SQL GROUP BY queries
/*
GROUPAGGREGATE
Creates summary tables of data similar to SQL's GROUP BY queries
Inputs
- dat: a range or array of data with at least two columns,
one of which's control value must be "group" and one not "group"
- control: a single-row array, where COLUMNS(control)=COLUMNS(dat), of values from this list:
group - the values in this column will be output as row headers
textjoin - the values in the corresponding column in dat will be passed to the TEXTJOIN function
sum - the values in the corresponding column in dat will be passed to the SUM function
min - the values in the corresponding column in dat will be passed to the MIN function
max - the values in the corresponding column in dat will be passed to the MAX function
counta - the values in the corresponding column in dat will be passed to the COUNTA function
count - the values in the corresponding column in dat will be passed to the COUNT function
average - the values in the corresponding column in dat will be passed to the AVERAGE function
Returns
An array with COLUMNS(dat) columns and ROWS([unique combinations of grouping values]) rows,
where each non-group cell on a row is the result of the corresponding control function applied
to that column for those groups
Dependencies
RECURSIVEFILTER
https://gist.github.com/ncalm/f1c7b5bd8fe86528da88798b545053c8
*/
GROUPAGGREGATE = LAMBDA(dat,control,
LET(
group_control,control="group",
group_dat,FILTER(dat,group_control),
groups,UNIQUE(group_dat),
group_cols,COLUMNS(groups),
group_col_indices,LET(f,SEQUENCE(1,COUNTA(control))*(group_control),FILTER(f,f<>0)),
val_col_indices,LET(f,SEQUENCE(1,COUNTA(control))*(group_control=FALSE),FILTER(f,f<>0)),
result_arr,MAKEARRAY(
ROWS(groups),
COLUMNS(dat),
LAMBDA(r,c,
LET(
measure_col,INDEX(val_col_indices,1,c-group_cols),
measure,INDEX(RECURSIVEFILTER(dat,group_col_indices,INDEX(groups,r)),,measure_col),
IF(
c<=group_cols,
INDEX(groups,r,c),
CHOOSE(
XMATCH(
INDEX(control,1,measure_col),
{"textjoin",
"sum",
"min",
"max",
"counta",
"count",
"average"}
),
TEXTJOIN(", ",FALSE,SORT(UNIQUE(measure))),
SUM(measure),
MIN(measure),
MAX(measure),
COUNTA(measure),
COUNT(measure),
AVERAGE(measure)
)
)
)
)
),
result_arr
)
);
@ExcelRobot
Copy link

I love this LAMBDA function! I have one bug fix for you and some suggestions to make it even better:

One issue I found is that if column you are grouping by only has one unique value, the calculated columns return #REF!. The fix is to add a second comma in the INDEX on row 45: INDEX(groups,r,) instead of INDEX(groups,r).

Some suggestions:

  1. Consider adding a "skip" option to the control array so that if I pass it a table with four columns but I only want two of them back, I can pass {"group","skip","skip","sum"}.
  2. Also consider adding a "blank" option that returns just a blank column in the place of one of your data columns. This is handy when you are stacking different combinations of groupings on top of each other and you still want the columns to line up.
  3. In the case where the columns you want to sum up are not to the right of the columns you want to group by, the returned results are not coming back in the same order as they were passed in. Not sure if that is a feature or a bug, but it would be great to have the option to return the columns in the same order.
  4. Sometimes tables have a header row included. It would be awesome if there was an optional parameter "has_header" (default: false), that you can use to include a header on the result.
  5. And if you want to make it recursive, include an "include_total_row" optional parameter (default: false), that will call itself to calculate a grand total row for itself and VSTACK it to the bottom.

Cheers!

@ncalm
Copy link
Author

ncalm commented Aug 1, 2022

Thanks so much for this! I've been meaning to come back to this and rewrite it with some of the things I've learned since I originally wrote it. I will definitely incorporate your feedback when I do that.

@paulfelce
Copy link

This is an absolutely beautiful piece of work. I will be using it heavily until Group by is rolled out!

@ExcelRobot , can you not use choosecolumns to grab the cols you want?(for suggestion 1)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment