Skip to content

Instantly share code, notes, and snippets.

@ncalm
Last active June 15, 2023 09:43
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 1 You must be signed in to fork a gist
  • Save ncalm/fbb8551466c472d3d16482af9936795b to your computer and use it in GitHub Desktop.
Save ncalm/fbb8551466c472d3d16482af9936795b to your computer and use it in GitHub Desktop.
This Excel LAMBDA function one-hot encodes a categorical variable
/*
ONEHOT
One-hot encodes a catagorical variable
Inputs
- rng: a single-column array or range of categorical data (usually text)
Returns
For each unique value v in rng, a new column with header rng.header & "_" & v
where a cell in the column is 1 if the row in rng contains the value represented
in the output column, or zero otherwise\
Example
If rng = {"Country";"United States";"United Kingdom";"United States"}
ONEHOT:
{
"Country_United_States","Country_United_Kingdom";
1 ,0;
0 ,1;
1 ,0
}
*/
ONEHOT = LAMBDA(rng,
LET(
var,INDEX(rng,1,1),
vals,UNIQUE(INDEX(rng,2,1):INDEX(rng,ROWS(rng),1)),
heads,var&"_"&TRANSPOSE(SUBSTITUTE(vals," ","_")),
MAKEARRAY(
ROWS(rng),
COLUMNS(heads),
LAMBDA(r,c,
IFS(
r=1,INDEX(heads,1,c),
INDEX(rng,r,1)=INDEX(TRANSPOSE(vals),1,c),1,
TRUE,0
)
)
)
)
);
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment