Skip to content

Instantly share code, notes, and snippets.

View casperlehmann's full-sized avatar

Casper Lehmann casperlehmann

View GitHub Profile
@justinnaldzin
justinnaldzin / spark_dataframe_size_estimator.py
Created July 18, 2018 19:29
Estimate size of Spark DataFrame in bytes
# Function to convert python object to Java objects
def _to_java_object_rdd(rdd):
""" Return a JavaRDD of Object by unpickling
It will convert each Python object into Java object by Pyrolite, whenever the
RDD is serialized in batch or not.
"""
rdd = rdd._reserialize(AutoBatchedSerializer(PickleSerializer()))
return rdd.ctx._jvm.org.apache.spark.mllib.api.python.SerDe.pythonToJava(rdd._jrdd, True)
# Convert DataFrame to an RDD
@CurtHagenlocher
CurtHagenlocher / RowExpression.From.pq
Created March 6, 2018 23:26
RowExpression.From as JSON
let
Value.FixType = (value, optional depth) =>
let
nextDepth = if depth = null then 3 else depth - 1,
result = if depth = 0 then null
else if value is type then TextType(value)
else if value is table then Table.TransformColumns(value, {}, @Value.FixType)
else if value is list then List.Transform(value, each @Value.FixType(_, nextDepth))
else if value is record then
Record.FromList(List.Transform(Record.ToList(value), each @Value.FixType(_, nextDepth)), Record.FieldNames(value))
@ibraheem4
ibraheem4 / postgres-brew.md
Last active July 18, 2024 00:18 — forked from sgnl/postgres-brew.md
Installing Postgres via Brew (OSX)

Installing Postgres via Brew

Pre-Reqs

Brew Package Manager

In your command-line run the following commands:

  1. brew doctor
  2. brew update
@noelboss
noelboss / git-deployment.md
Last active July 16, 2024 09:50
Simple automated GIT Deployment using Hooks

Simple automated GIT Deployment using GIT Hooks

Here are the simple steps needed to create a deployment from your local GIT repository to a server based on this in-depth tutorial.

How it works

You are developing in a working-copy on your local machine, lets say on the master branch. Most of the time, people would push code to a remote server like github.com or gitlab.com and pull or export it to a production server. Or you use a service like deepl.io to act upon a Web-Hook that's triggered that service.

@iconara
iconara / queries.sql
Last active November 13, 2023 22:26
Low level Redshift cheat sheet
-- Table information like sortkeys, unsorted percentage
-- see http://docs.aws.amazon.com/redshift/latest/dg/r_SVV_TABLE_INFO.html
SELECT * FROM svv_table_info;
-- Table sizes in GB
SELECT t.name, COUNT(tbl) / 1000.0 AS gb
FROM (
SELECT DISTINCT datname, id, name
FROM stv_tbl_perm
JOIN pg_database ON pg_database.oid = db_id

Moved

Now located at https://github.com/JeffPaine/beautiful_idiomatic_python.

Why it was moved

Github gists don't support Pull Requests or any notifications, which made it impossible for me to maintain this (surprisingly popular) gist with fixes, respond to comments and so on. In the interest of maintaining the quality of this resource for others, I've moved it to a proper repo. Cheers!

@higemaru
higemaru / gist:5863873
Last active April 19, 2021 11:25
GIT_COMMITTER_DATE / GIT_AUTHOR_DATE
export GIT_COMMITTER_DATE="Fri Jun 21 20:26:41 2013 +0900"
git commit --amend --date "Fri Jun 21 20:26:41 2013 +0900" -m 'fix committer_date and author_date'
unset GIT_COMMITTER_DATE
export GIT_COMMITTER_DATE="2013-08-01 23:01:01 +0900"
git commit --amend --date "2013-08-01 23:01:01 +0900" -m 'fix committer_date and author_date'
unset GIT_COMMITTER_DATE