Skip to content

Instantly share code, notes, and snippets.

You are a data cleaning assistant. I will give you a list of messy addresses. Standardize them into a clean table with the following columns: street address, city, state, postal code, country
Rules:
- Use appropriate capitalization.
- Always spell out the full name of a state or province. Do not use abbreviations.
- Remove non-city locality descriptors (e.g., downtown, midtown, metro area, greater, borough of, city of). Do not place them in the city or street address.
- If a city token includes directional prefixes/suffixes (SE, NW, North, South), discard those and return only the clean city name; do not attach the markers to the street address.
- If the city is missing and a postal code is available, infer the city from the postal code.
- If the state/province is missing but the street address and city are available, infer the state/province.
- If a state/province is present but the country is missing, infer the country from it (e.g., Ontario → Canada; TX/CO/AL → United States).
@ncalm
ncalm / excel-lambda-describe-v2.txt
Created August 13, 2025 23:01
Functions for grouped descriptive statistics in Excel
// Simple frequency table using COUNTA
FREQTBL = LAMBDA(dimension, GROUPBY(dimension, dimension, COUNTA,,0));
// Alternate version using EXPAND
FREQTBL_alt = LAMBDA(dimension, GROUPBY(dimension,EXPAND(1,ROWS(dimension),,1),COUNT,,0));
DESCRIBE = LAMBDA(data,
LET(arepl, REPLICATE(VSTACK,ROWS(data)),
LAMBDA(statistic,
GROUPBY(arepl(statistic),
@ncalm
ncalm / pq-regex-example-url.txt
Last active June 19, 2025 18:12
Enabling Python and Regular Expressions in Power Query in Excel
Here's an example of how this pyscript app works in the browser:
https://owenprice.pyscriptapps.com/pq-regex/latest/?txt=abc123def456&pattern=%5Cd%2B
To clone the project and modify, create an account on pyscript.com and clone this project:
https://pyscript.com/@owenprice/pq-regex/
@ncalm
ncalm / excel-geography-properties.py
Created March 29, 2025 18:58
Function to extract properties from Geography RichValue type in Python in Excel
def flatten_geography(rich_value):
props = rich_value.data.get("properties", {})
flat = {}
def extract(val):
if not isinstance(val, dict):
return str(val) if val is not None else None
typ = val.get("type")
if typ in ("String", "FormattedNumber"):
def portfolio_analysis(data: UDF.Range, has_dates: bool = True, price_type: str = "Close",
annual_factor: int = 252, custom_weights: list = None, risk_free_rate: float = 0.0) -> UDF.Range:
"""
Calculates key portfolio metrics from stock price data.
Args:
data: Daily stock prices. If has_dates=True, the first column should be dates.
has_dates: Whether the first column contains dates (True) or is price data (False).
price_type: String indicating price type used ("Close" or "Adj Close").
annual_factor: Number of trading days per year, defaulting to 252 for daily data.
@ncalm
ncalm / pyexcel_print_globals.py
Created March 3, 2025 14:36
Print the variables in your global scope in Python in Excel
import inspect
begin_print = False
module_print = True
function_print = True
for key, value in list(globals().items()):
if begin_print and not key.endswith("_print"):
# print imported module names if requested
if module_print and inspect.ismodule(value):
@ncalm
ncalm / excel-lambda-DATE.EXP.txt
Created February 24, 2025 14:44
This Excel LAMBDA function allows us to pass 1, 2 and 3 sequences to the DATE function to produce complex lists of dates
DATE.EXP = LAMBDA(y, m, d,
LET(
mdJoin, CROSSJOIN(m, d),
ymdJoin, CROSSJOIN(y, mdJoin),
yearArray, CHOOSECOLS(ymdJoin, 1),
monthArray, CHOOSECOLS(ymdJoin, 2),
dayArray, CHOOSECOLS(ymdJoin, -1),
SORT(MAP(yearArray, monthArray, dayArray, DATE))
)
);
@ncalm
ncalm / excel-lambda-FILTER.2D.txt
Created February 13, 2025 21:18
This Excel LAMBDA functions enables simultaneous row and column filtering as well as 2D include arrays
IFOMITTED = LAMBDA(arg, then, IF(ISOMITTED(arg),then,arg));
// Collapse a boolean array to a boolean vector
COLLAPSE = LAMBDA(
array,
[collapse_with], // AND (default) or OR
[collapse_to], // 0 (default) = column or 1 = row
IF(collapse_to=0,BYROW,BYCOL)(array,IFOMITTED(collapse_with, AND))
);
Option Explicit
Public Sub TogglePythonCells()
Dim cell As Range
For Each cell In Selection
If Left(cell.Formula2, 3) = "=PY" Then
If cell.HasFormula Then
cell.Formula2 = "'" & cell.Formula2
Else
import numpy as np
import bisect
import timeit
import matplotlib.pyplot as plt
# Initialize list to store results
results = []
# Test for different array sizes
test_values = [1000, 10000, 100000, 1000000, 10000000, 100000000]