Skip to content

Instantly share code, notes, and snippets.

@jimpea
Last active October 21, 2024 17:57
Show Gist options
  • Save jimpea/4bf717e62d4ac432b23134c59965a8aa to your computer and use it in GitHub Desktop.
Save jimpea/4bf717e62d4ac432b23134c59965a8aa to your computer and use it in GitHub Desktop.
Some usefull Excel Lambda functions
/*
Append two ranges horizontally.
Inputs:
- range1: the first range
- range2: the second range
- default: the value entered into missing rows.
Return: The merged ranges, with empty rows filled with the default value. Missing
value within either of the two ranges filled with zeros (0). The number of rows
equals that of the range with the greatest number of rows and the number of columns
is the sum of the columns in each input range.
Example: Merge the ranges E1:F3 and H1:I2 with cells in the missing row marked as "na".
=APPENDRANGEH(E1:F3,H1:I2,"na")
*/
APPENDRANGEH = LAMBDA(range1, range2, default,
LET(
rows1, ROWS(range1),
rows2, ROWS(range2),
cols1, COLUMNS(range1),
cols2, COLUMNS(range2),
rowindex, SEQUENCE(MAX(rows1, rows2)),
colindex, SEQUENCE(1, cols1 + cols2),
result, IF(
colindex <= cols1,
INDEX(range1, rowindex, colindex),
INDEX(range2, rowindex, colindex - cols1)
),
IFERROR(result, default)
)
);
/*
Append two ranges vertically.
Inputs:
- range1: the first range
- range2: the second range
- default: the value entered into missing rows.
Return: The merged ranges, with empty columns filled with the default value. Missing
values within either of the two ranges filled with zeros (0). The number of columns
equals that of the range with the greatest number of columnss and the number of rows
is the sum of the rows in each input range.
Example: Merge the ranges A1:C2 and A4:B5 with cells in the missing column in the second
range marked as "missing".
=APPENDRANGEV(A1:C2,A4:B5,"missing")
range(A1:C2)
1 2 3
4 5 6
range(A4:B5)
10
30 40
gives the following output:
1 2 3
4 5 6
10 0 missing
30 40 missing
Note the second range has one fewer columns than the first, these positions are marked by
'missing' in the output'. The second range also only has one value in the first row. This
is replaced by a zero(0) in the output.
*/
APPENDRANGEV = LAMBDA(range1, range2, default,
LET(
rows1, ROWS(range1),
rows2, ROWS(range2),
cols1, COLUMNS(range1),
cols2, COLUMNS(range2),
rowindex, SEQUENCE(rows1 + rows2),
colindex, SEQUENCE(1, MAX(cols1, cols2)),
result, IF(
rowindex <= rows1,
INDEX(range1, rowindex, colindex),
INDEX(range2, rowindex - rows1, colindex)
),
IFERROR(result, default)
)
);
/*
Convert an m x n range to an m*n x 1 range
Example: flatten the range in A1#
=FLATTEN(A1#)
inputs:
- array: the cell range to flatten
Return::
- A list of the cells in the array. Empty cells in the range
include as zero (0) in the list.
*/
FLATTEN = LAMBDA(array,
LET(
rows_, ROWS(array),
cols_, COLUMNS(array),
seq_, SEQUENCE(rows_ * cols_, 1, 0, 1),
row_, INT(seq_ / cols_),
col_, MOD(seq_, cols_),
INDEX(array, row_ + 1, col_ + 1)
)
);
/* Example least squares fit to a first order polynomial.
Inputs: Note the input_ and response_ arrays correspond to the 'standard curve'
used to build the linear model. Both ranges must be equal length.
- input_: the input ('x values')
- response_: the response to the input ('y values')
- test_: m x 1 array of test inputs
Return:
- The response of the linear model to the test_ inputs.
*/
LINFIT = LAMBDA(input_, response_, test_,
iferror(
LET(
total_rows_, COUNT(test_),
ones_, SEQUENCE(total_rows_, 1, 1, 0),
coeffs_, LINEST(response_, input_),
INDEX(coeffs_, 1, 2) * ones_ + INDEX(coeffs_, 1, 1) * test_
),
"error: inputs and response must be same size")
);
/*
implementation of Simpson's rule to calculate definite integral
\int_a^b f(x) dx \approx \frac{b - a}{6}\left[f(a) + 4f(\frac{a + b}{2}) + f(b)\right]
Inputs:
- lower: lower limit
- upper: upper limit
- dx: interval
- f: function that maps the array to output
Example: Calculate the area under the curve (x+1)/(x+2), defined by the lambda function,
from the lower, upper and dx in cells O6, O7 and O8.
=SIMPSON(O6,O7,O8,LAMBDA(n, (n + 1)/ (n + 2)))
*/
SIMPSON = LAMBDA(lower,upper,dx,f,
LET(
seq_,SEQUENCE(1+(upper-lower)/dx,1,lower,dx),
REDUCE(
0,
seq_,
LAMBDA(acc,b,
acc +
((dx) / 6) *
(f(b - dx) + 4 * f((2 * b - dx) / 2) + f(b))
)
)));
/*
Remove blanks cells from a single row or column.
Input:
array_list: a list either m x 1 or 1 x n.
Return:
list with blanks removed
*/
NOBLANK = LAMBDA(array_list,
FILTER(array_list,
array_list <> "",
""
)
);
/*
Example factorial implementation using recursion
Inputs:
- n: the integer argument
Return:
- The factorial of the input n
Of course, Excel already has a FACT function, which works with arrays
as the input. This version only works with a single value:
Example, get the factorial of the integer in cell A2:
=jp.MYFACT(A2)
To apply this function to an array in cell A2#, wrap it into a MAP
function:
=LAMBDA(array, MAP(array, LAMBDA(n, jp.MYFACT(n))))(A2#)
*/
MYFACT = LAMBDA(n, IF(n < 2, 1, n * MYFACT(n - 1)));
/* Example fibonacci implementation using recursion
F_0 = 0, F_1 = 1, F_n = F_{n-1} + F_{n-2}
Inputs:
- n1_: The first initial term F_0
- n2_: The second initial term F_1
- c_: the number of times to count up the sequence.
Example: to find the third Fibonacci number starting from 0, 1:
= jp.MYFIB(0,1,n)
As with the MYFACT function, wrap this in a MAP function to work with an array
for instance for an array in cells A4:A11:
=LAMBDA(array, MAP(array, LAMBDA(n, jp.MYFIB(0,1,n))))(A4:A11)
*/
MYFIB = LAMBDA(n1_, n2_, c_,
IF(c_ < 1, n1_ + n2_, MYFIB(n2_, n1_ + n2_, c_ - 1))
);
/*
Implementation of Markov chain
Inputs:
m: Left Stochastic matrix that describes the transitions of a Markov Chain.
v: initial vector in chain
n: length of chain
Return:
The vector in the state reached after n interations
Example:
With a 3 by 3 matrix defined in $B$2:$D$4 and a 3 by 1 vector in $B$7:$B$9,
Using 3 iterations, the following returns a 3 by 1 vector:
=MARKOV($B$2:$D$4, $B$7:$B$9, 3)
*/
MARKOV = Lambda(m, v, n,
if(n=0,
v,
Markov(m, mmult(m, v), n-1)));
// Return a version 4 UUID (aka GUID)
// example from <https://stackoverflow.com/questions/7031347>
// The first character of the third group is always 4 to
// signify a V4 UUID per RFC 4122. The first character of the
// fourth group is always between 8 and b, also per RFC 4122.
UUID = LAMBDA(
LOWER(
CONCATENATE(
DEC2HEX(RANDBETWEEN(0,4294967295),8),
"-",
DEC2HEX(RANDBETWEEN(0,65535),4),
"-",
DEC2HEX(RANDBETWEEN(16384,20479),4),
"-",
DEC2HEX(RANDBETWEEN(32768,49151),4),
"-",
DEC2HEX(RANDBETWEEN(0,65535),4),
DEC2HEX(RANDBETWEEN(0,4294967295),8)
)
)
);
/*
Fold up a list to an array -- for instance a list of 12 values
can fold into a 3 by 4, 4 by 3, 2 by 6, 6 by 2, 1 by 12 or 12 by 1 array.
Inputs:
data - a list of values. This must contain rows * cols elements.
rows - number of rows in the output array
cols - number of cols in the output array
Return:
an array rows x cols
Raises:
#VALUE! error on worksheet if data does not contain rows * cols
elements.
*/
fold = lambda(data, rows, cols,
let(
lvalues_, sequence(rows, cols, 0, 1),
larray_, sequence(rows*cols, 1, 0, 1),
xlookup(lvalues_, larray_, data)
));
/*
The cumulative values of an nx1 array
inputs:
array_, an n x 1 array
return: An nx1 array of cumulative values derived from the input
Example: calculate the cumulative values of the array in cells
A1:A5:
=CUMULATOR(A1:A5)
*/
CUMULATOR = LAMBDA(array_,
LET(
seq_,SEQUENCE(COUNT(array_),1,1,1),
MAP(
seq_,
LAMBDA(n,SUM(INDEX(array_,SEQUENCE(n,1,1,1))))
)
)
);
// mark data with user-defined upper and lower limits
//
// inputs:
// data: numerical array
// lower: the lower limit
// upper: the upper limit
// return:
// array of shape(data), elements set to -1, 0 or 1 according to
// less than lower limit, between limits or more that upper limit
MARKLIMITS = LAMBDA(data,lower, upper,
LET(
mark, 0,
map(data, lambda(x, if(x > upper, 1, if(x < lower, -1, 0))))
)
);
// Mark data with upper and lower limits
// Assume:
// - the data is normally distributed
// - the data is numeric
// - the data is in a contiguous range (array)
// - the data is normally distributed.
// If not normally distributed, set your own limits
// and use the MARKLIMITS formula instead
//
// Depends on the MARKLIMITS formula
//
// Inputs:
// data: numerical array
// alpha: Defines the uppar and lower limit
// Return:
// array of same dimension as input data marked
// 1: over upper limit, 0: between limits, -1: below lower limit
//
// Example:
// `=mark.limits.2T(B18#, D16)`
// data in cell `B18#`, the alpha value in cell `D16`
//
// for instance,
// set alpha = 0.1 to give the 10% - 90% interval
// set alpha = 0.05 to give the 5% - 95% interval
// set alpha = 0.025 to give the 2.5% - 97.5% interval
MARKLIMITS2T = LAMBDA(data,alpha,
LET(
mean, average(data),
zl, norm.inv(alpha, 0, 1),
zu, norm.inv(1 - alpha, 0, 1),
lower_limit, mean + zl * STDEV.S(data),
upper_limit, mean + zu * STDEV.S(data),
MARKLIMITS(data, lower_limit, upper_limit)
//CHOOSE({1,2, 3, 4}, zl, zu, lower_limit, upper_limit)
)
);
//Count the number of repeated values in a list.
//Assumes the list is vertical.
//
// inputs:
// range: a
// output:
// A. n x 2 sorted list of the unique values in the input list and the counts.
//
// Example:
// =count_uniques(tbl_names[initials])
//
UNIQUE_COUNTS = LAMBDA(range,
LET(
uniques, SORT(UNIQUE(range)),
counts, MAP(uniques, LAMBDA(s, SUM(IF(s = range, 1, 0)))),
SORTBY(CHOOSE({1, 2}, uniques, counts), counts, -1)
)
);
// Filter a list by their multiplicity, filtered by a
// limiting value
//
// assumes: the input list is vertical n x 1
//
// inputs:
// range: the input array (n x 1)
// cutoff: The lower limit cutoff
//
// returns:
// n X 2 array from the range giving the multiplicity, ordered
// according to multiplicity.
//
// example:
// List the repleated elements in a list of names, restricting
// to multiplicity above 2:
// `=jp.FILTER_UNIQUE_COUNTS(tbl_names[fname], 2)`
FILTER_UNIQUE_COUNTS = LAMBDA(range, cutoff,
LET(
input_array, UNIQUE_COUNTS(range),
FILTER(input_array, CHOOSECOLS(input_array,2) > cutoff)));
// Output data to given number of significant figures
//
// inputs:
// data: An excel range containing numerical data
// sig_figs: number of sugnificant figures
// return: the data limited to the given number of significant figures
//
// Example:
// Output the data in range 'data_' to the number of significant figures
// given in 'sig_figs'
SIG_FIGS = LAMBDA(data, sig_figs,
ROUND(data, sig_figs - INT(LOG10(ABS(data)))
)
);
newton_raphson = LAMBDA(guess, afun, delta,
/* Newton Raphson method for estimating roots. **Note** Does
not work for complex roots.
args:
guess: user guess for the root
afun: a LAMBDA expression that takes one argument
delta: serves both as the calculation cut-off limlit and
the delta interval to use in calculating the function
derivative
return:
The best guess for the root.
Example:
This expression returns a root for the expression passed as the LAMBDA
expression (-3.23606798)
=newton_raphson(-5, LAMBDA(x, x^2 + 2*x -4), 0.000011)
For the other root (1.23067978):
=newton_raphson(2, LAMBDA(x, x^2 + 2*x -4), 0.000011)
*/
LET(
f1_, afun(guess),
f2_, (afun(guess + delta) - f1_)/delta,
new_guess_, guess - f1_/f2_,
IF(ABS(f1_) < delta, new_guess_, newton_raphson(new_guess_, afun, delta))
)
);
/*
Normalise a matrix
Input:
- M: range of numerical data
Return:
- The data in M normalised to mean:0 and stdev: 1
*/
mnorm = LAMBDA(M,
LET(
means, BYCOL(M, LAMBDA(col, AVERAGE(col))),
stdevs, BYCOL(M, LAMBDA(col, STDEV.S(col))),
(M - means) / stdevs
)
);
/*
Euclidian distance
Input:
- P: a data point from a row of numeric data
Return:
- The Euclidiean distance between the point and the origin
*/
EuclidianD = LAMBDA(P, SQRT(SUM(P ^ 2)));
/*
Unpivot an Excel table
Users may need to collect data directly into an excel range using the intersection
between column and rows to capture information. For instance a microtitre plate
has rows labelled A..H and columns labelled 1..12 with data collected on the (row, column)
pairs.
Inputs:
input: An excel range including the column titles and rows.
Return: The pivoted record putting each (row, column, data) record
On eache line.
Example:
`=unpivot(A3:H6)`
Here, the column headers are in cells A4:A6 and the row headers in cells A4:C4.
Note that the upperleft corner cell is blank. The data is in cells B4:H6. This
gives 21 cells of data -- one for each row, column pair. The output includes a
a header row ["row", "col", "data"]. The user can now select the formula cell
and insert a pivot table
*/
unpivot = LAMBDA(input,
let(
data, take(input, -(ROWS(input)-1), -(COLUMNS(input) - 1)),
data_column_count, COLUMNS(data),
data_row_count, ROWS(data),
row_sequence, SEQUENCE(data_row_count, 1, 1, 1),
col_sequence, SEQUENCE(data_column_count, 1, 1, 1),
row_names, INDEX(input, row_sequence + 1, 1),
col_names, INDEX(input, 1, col_sequence + 1),
data_count, data_column_count * data_row_count,
data_sequence, SEQUENCE(data_count, 1, 0, 1),
stacked, hstack(
index(row_names, int(data_sequence/data_column_count) + 1),
index(col_names, MOD(data_sequence, data_column_count) + 1),
tocol(data)
),
headers, {"row","col","data"},
vstack(headers, stacked)
)
);
/* find common elements in two ranges
Args:
range_a: the first range to test
range_b: the second range to test
Return:
List of common elements
*/
common=LAMBDA(range_a, range_b,
LET(
message, "not_found",
list_a, trim(TOCOL(range_a)),
list_b, trim(TOCOL(range_b)),
mapped, MAP(list_a, LAMBDA(test, FILTER(list_b, list_b = test, message))),
unique(FILTER(mapped, mapped <> message))
)
);
// Interpolate for y between two points (ax, ay), (bx, by) at an intermediate
// value x
// Assumes:
// bx != ax
// by != ay
// bx - ax != 0
interpy = lambda(ax,ay, bx, by, x,
let(
da, bx - ax,
db, by - ay,
slope, db/da,
ay + (x - ax)*slope
)
);
// Interpolate for x between two points (ax, ay), (bx, by) at an intermediate
// value y
// Assumes: same as for interpy
interpx = lambda(ax, ay, bx, by, y,
let(
da, bx - ax,
db, by - ay,
slope, db/da,
ax + (y - ay)/slope
)
);
@ncalm
Copy link

ncalm commented Apr 8, 2022

Really interesting work! Thanks for sharing.

@doilabekho
Copy link

for more precise, the function SIMPSON must be

SIMPSON = LAMBDA(lower,upper,dx,f,
LET(
seq_,SEQUENCE(0+(upper-lower)/dx,1,lower+dx,dx),
REDUCE(
0,
seq_,
LAMBDA(acc,b,
acc +
((dx) / 6) *
(f(b - dx) + 4 * f((2 * b - dx) / 2) + f(b))
)
)));

@Excali78
Copy link

Thanks a lot for sharing.
For the newton_raphson(), it might be useful to enclose the test in an ABS function to ensure we are close to zero and not only inferior to delta:
Instead of
IF(f1_ < delta ...
read
IF(ABS(f1_ < delta)...
Best Regards

@jimpea
Copy link
Author

jimpea commented Jan 12, 2024

Thanks @Excali78 for the feedback, I have updated the gist.

@Excali78
Copy link

Hi jimpea,

You might have a look to Ismail HOSEN video and link to his lambda functions in the comments section, REPEATEACHROWNTIMES() and VSTACKNTIMES(): https://www.youtube.com/watch?v=oP82s9roDuQ

I join you 2 LAMBDAs that might interest your readers, you migth adapt / translate the parameters (now in French) and add them to your repository.

ROUNDTOSUM(myTable, [targetTotal], [RoundingStep]) : rounds a table to a target total with a certain rounding step. It rounds with the greatest remainder method, and in cas of ex-aequos, it prioritises small absolute values (greatest relative errors) :

=LAMBDA(maTable,[totalCible],[pasArrondi],
LET( pas,IF(ISOMITTED(pasArrondi),1,pasArrondi),
somTable,SUM(maTable),
totCib,MROUND(IF(ISOMITTED(totalCible), somTable,totalCible),pas),
nbElements,ROWS(maTable)COLUMNS(maTable),
homothetie,maTable
(totCib/somTable),
arrondiInf,pas*INT(homothetie/pas),
nbPasADistribuer, ROUND((totCib-SUM(arrondiInf))/pas,0),
restesAvecTableEtIndex,HSTACK(TOCOL(homothetie-arrondiInf),TOCOL(maTable),SEQUENCE(nbElements)),
restesTries,SORTBY(restesAvecTableEtIndex,INDEX(restesAvecTableEtIndex,0,1),-1,INDEX(restesAvecTableEtIndex,0,2),1),
restesTriesAvecPas,HSTACK(restesTries,IF(SEQUENCE(nbElements)<=nbPasADistribuer,1,0)),
indicesTriesAvecPas,SORTBY(restesTriesAvecPas,INDEX(restesTriesAvecPas,0,3)),
restesAvecTableEtIndex
))

The 2nd formula creates a rounded table so as to match the rows and columns subtotals of a contingency table (expected values):
PRORATA.RC(rowOfColumnsTotals, columnOfRowsTotals, nDigits)

=LAMBDA(ligneTotauxColonnes,colonneTotauxLignes,nDigits,
LET(total,SUM(ligneTotauxColonnes),
nCol,COLUMNS(ligneTotauxColonnes),
IF(total<>SUM(colonneTotauxLignes),"sous-totaux LC incompatibles entre eux !",
LET(matrice,ligneTotauxColonnes*colonneTotauxLignes/total,
MMULT(ROUND(MMULT(matrice,MINTEG(nCol)),nDigits),MINVERSE(MINTEG(nCol)))))))

Helper function:
MINTEG(n) : returns a matrix of dimension n for computing the integration/cumulative sum of a matrix of n columns. Basically, it returns a triangular matrix with 1's and 0's.
=LAMBDA(n,LET(dim,n,--(SEQUENCE(,dim,1)>=SEQUENCE(dim,,1))))
Hope this helps,
Thierry

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment