Matt Keranen kmatt

  • SE US
@kmatt
kmatt / build-spark-pip.sh
Created September 1, 2023 21:13
Build Spark for Python Pip
#!/bin/bash
# build/build-spark-pip.sh
# https://spark.apache.org/docs/3.4.1/building-spark.html
export MAVEN_OPTS="-Xss64m -Xmx2g -XX:ReservedCodeCacheSize=1g"
#./build/mvn -DskipTests clean package
pushd ..
kmatt / dm_execs.sql
Last active August 29, 2023 01:59
SQL Server execution status
SELECT R.session_id, DatabaseName = db_name(R.database_id), S.text, R.Status, R.Command,
R.percent_complete,
CAST(CAST(DATEADD(mi, R.total_elapsed_time/60000.0, 0) AS TIME) AS CHAR(5)) runTime,
CAST(CAST(DATEADD(mi, R.estimated_completion_time/60000.0, 0) AS TIME) AS CHAR(5)) remain,
R.reads, R.writes, R.cpu_time, R.status, R.wait_type
FROM sys.dm_exec_requests R
CROSS APPLY sys.dm_exec_sql_text(R.sql_handle) S
WHERE R.session_id <> @@SPID
kmatt / duckssh.py
Created August 29, 2023 01:36
DuckDB over SSH
"""
Run a DuckDB query over SSH to avoid scanning the full file set on a remote server,
and make the results available to the local DuckDB process
"""
import io, paramiko, duckdb
sql = "SELECT * FROM read_json_auto('/path/to/data.json')"
cmd = f'duckdb -csv -s "{sql}"'
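The preview above stops after building the remote command. As a rough sketch of how the rest might look, the following runs the remote `duckdb` CLI through paramiko and registers the CSV output in a local in-memory DuckDB connection. The function names, the `host` and `user` parameters, the `remote_result` view name, and the use of pandas to parse the CSV are all assumptions for illustration, not the gist's actual code. The SQL is shell-quoted with `shlex.quote` rather than wrapped in double quotes, so queries containing quotes survive the remote shell.

```python
import io
import shlex


def build_remote_command(sql: str) -> str:
    """Build the remote duckdb CLI call, shell-quoting the SQL text."""
    return f"duckdb -csv -s {shlex.quote(sql)}"


def query_over_ssh(host: str, user: str, sql: str):
    """Run `sql` with the remote duckdb CLI and expose the result locally.

    `host` and `user` are hypothetical parameters; authentication is assumed
    to come from the local SSH agent or default key files.
    """
    # Third-party packages are imported lazily so the helper above
    # stays usable without them.
    import duckdb
    import pandas as pd  # assumption: pandas is used to parse the CSV
    import paramiko

    client = paramiko.SSHClient()
    client.load_system_host_keys()
    client.connect(host, username=user)
    try:
        _stdin, stdout, stderr = client.exec_command(build_remote_command(sql))
        csv_bytes = stdout.read()  # drain stdout before asking for the exit code
        if stdout.channel.recv_exit_status() != 0:
            raise RuntimeError(stderr.read().decode())
    finally:
        client.close()

    remote_df = pd.read_csv(io.BytesIO(csv_bytes))
    con = duckdb.connect()  # local in-memory database
    con.register("remote_result", remote_df)  # hypothetical view name
    return con
```

With this shape, only the query result crosses the network, and the caller can continue with `con.sql("SELECT count(*) FROM remote_result")` on the local side.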
kmatt / pyvendor.sh
Created August 18, 2023 22:31
Vendor Python modules from Pip
pip download --no-deps --dest vendor -r requirements.txt
kmatt / zig-cross-compile-windows.sh
Last active July 18, 2023 14:47
Cross compile to Windows using Zig
# On Linux or macOS
export CC="zig cc -target x86_64-windows-gnu"
export CXX="zig c++ -target x86_64-windows-gnu"
./configure --host x86_64-windows-gnu
make
kmatt / build-spark-pip.sh
Created July 12, 2023 23:44
PySpark build
# https://spark.apache.org/docs/3.4.1/building-spark.html
export MAVEN_OPTS="-Xss64m -Xmx2g -XX:ReservedCodeCacheSize=1g"
#./dev/make-distribution.sh --name custom-spark --pip --r --tgz -Psparkr -Phive -Phive-thriftserver
./dev/make-distribution.sh --name custom-spark --pip --tgz
#python setup.py sdist
pip install python/dist/pyspark-3.4.1.tar.gz
kmatt / pg_podman.sh
Created March 21, 2023 20:07
Podman / Docker PostgreSQL on Windows 10
podman pull docker.io/library/postgres
podman volume create pg_data
podman run -dt --name postgres \
-e POSTGRES_PASSWORD=*** \
--mount type=volume,src=pg_data,target=/var/lib/postgresql/data \
-p 5432:5432 \
postgres
kmatt / slugify.postgres.sql
Last active February 1, 2023 12:51 — forked from abn/slugify.postgres.sql
A slugify function for postgres
-- original source: https://medium.com/adhawk-engineering/using-postgresql-to-generate-slugs-5ec9dd759e88
-- https://www.postgresql.org/docs/9.6/unaccent.html
CREATE EXTENSION IF NOT EXISTS unaccent;
CREATE OR REPLACE FUNCTION public.slugify(v TEXT) RETURNS TEXT
LANGUAGE plpgsql
STRICT IMMUTABLE AS
$function$
BEGIN
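The function body is cut off in this preview. For comparison, here is a small Python analogue of the usual slug recipe (strip accents, lowercase, collapse everything else into hyphens). The exact steps of the PostgreSQL function are not visible here, so the sequence below is an assumption about the general technique, not a port of the gist.

```python
import re
import unicodedata


def slugify(value: str) -> str:
    """Turn arbitrary text into a lowercase, hyphen-separated ASCII slug."""
    # Decompose accented characters, then drop the combining marks,
    # which plays the role that unaccent() has in PostgreSQL.
    decomposed = unicodedata.normalize("NFKD", value)
    unaccented = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Lowercase, then replace each run of non-alphanumerics with one hyphen.
    hyphenated = re.sub(r"[^a-z0-9]+", "-", unaccented.lower())
    # Trim hyphens left over at either end.
    return hyphenated.strip("-")
```

For example, `slugify("Crème Brûlée!")` yields `creme-brulee`.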
kmatt / clojure-learning-list.md
Created January 18, 2023 04:09 — forked from ssrihari/clojure-learning-list.md
An opinionated list of excellent Clojure learning materials

An opinionated list of excellent Clojure learning materials

These resources (articles, books, and videos) are useful when you're starting to learn the language, or when you're learning a specific part of the language. This is an opinionated list, no doubt. I've compiled this list from writing and teaching Clojure over the last 10 years.

  • 🔴 Mandatory (for both beginners and intermediates)
  • 🟩 For beginners
  • 🟨 For intermediates

Table of contents

  1. Getting into the language
kmatt / mssql_insert_json.py
Created January 17, 2023 02:37 — forked from gordthompson/mssql_insert_json.py
Alternative to_sql() *method* for mssql+pyodbc
# Alternative to_sql() *method* for mssql+pyodbc or mssql+pymssql
#
# adapted from https://pandas.pydata.org/docs/user_guide/io.html#io-sql-method
import json
import pandas as pd
import sqlalchemy as sa
def mssql_insert_json(table, conn, keys, data_iter):
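The preview ends at the function signature. One plausible shape for such a `to_sql()` method, sketched here under assumptions, is to serialize the rows to a single JSON document and let SQL Server expand it with `OPENJSON ... WITH`. The helper names `build_openjson_insert` and `rows_to_json`, and the column-type mapping passed in, are hypothetical and not taken from the gist.

```python
import json


def build_openjson_insert(table_name, columns, placeholder="?"):
    """Build an INSERT ... SELECT ... FROM OPENJSON statement.

    `columns` maps column names to SQL Server type names (an assumed input;
    a real method would derive it from the table metadata).
    """
    col_list = ", ".join(f"[{name}]" for name in columns)
    # Each WITH entry maps a JSON property to a typed output column.
    with_clause = ",\n    ".join(
        f"[{name}] {sql_type} '$.\"{name}\"'"
        for name, sql_type in columns.items()
    )
    return (
        f"INSERT INTO [{table_name}] ({col_list})\n"
        f"SELECT {col_list}\n"
        f"FROM OPENJSON({placeholder})\n"
        f"WITH (\n    {with_clause}\n);"
    )


def rows_to_json(keys, data_iter):
    """Serialize the row tuples as a JSON array of objects keyed by column."""
    return json.dumps([dict(zip(keys, row)) for row in data_iter], default=str)
```

Inside a method with the `(table, conn, keys, data_iter)` signature, the statement and the JSON string could then be sent in one round trip, for instance with SQLAlchemy's `conn.exec_driver_sql(sql, (payload,))`, and the method passed to pandas as `df.to_sql(..., method=mssql_insert_json)`.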