ExpandingMan

  • Port Jefferson, NY
@ExpandingMan
ExpandingMan / deserialization_performance.jl
Created September 22, 2016 18:43
Julia deserialization performance (compared with Python pickle protocol 3)
using DataFrames
using PyCall
# using DatasToolbox
import Base.serialize
import Base.deserialize
const NROWS = 2*10^6
const FILENAME = "devtest1.jbin"
const PICKLE_FILENAME = "devtest1.pkl"
ExpandingMan / feather_performance.jl
Created October 4, 2016 16:52
testing performance of Feather.jl
using DataFrames
using Feather
using PyCall
@pyimport feather as pyfeather
# using DatasToolbox
const NROWS = 2*10^6
const FILENAME = "devtest1.feather"
const PYTHON_FILENAME = "pythontest.feather"
ExpandingMan / convertaxis.jl
Created January 19, 2017 21:33
Converting `TimeArray`s to have `Float64` Axes
typealias Period Union{Dates.TimePeriod, Dates.DatePeriod}
# definition of year as 365 days is consistent with Dates package
convert_ms(::Type{Dates.Millisecond}, t::Number) = t
convert_ms(::Type{Dates.Second}, t::Number) = t/1000
convert_ms(::Type{Dates.Minute}, t::Number) = t/(1000*60)
convert_ms(::Type{Dates.Hour}, t::Number) = t/(1000*60*60)
convert_ms(::Type{Dates.Day}, t::Number) = t/(1000*60*60*24)
convert_ms(::Type{Dates.Year}, t::Number) = t/(1000*60*60*24*365)
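The arithmetic behind these `convert_ms` methods can be sketched in Python. This is only an illustration and not part of the gist: the `MS_PER_UNIT` table and the string unit names are assumptions made for the example, standing in for the `Dates` period types that the Julia code dispatches on.

```python
# Illustrative sketch: milliseconds per unit, mirroring the divisors used above.
MS_PER_UNIT = {
    "millisecond": 1,
    "second": 1000,
    "minute": 1000 * 60,
    "hour": 1000 * 60 * 60,
    "day": 1000 * 60 * 60 * 24,
    "year": 1000 * 60 * 60 * 24 * 365,  # 365-day year, as the comment above states
}


def convert_ms(unit: str, t: float) -> float:
    """Convert a duration `t`, given in milliseconds, to the requested unit."""
    return t / MS_PER_UNIT[unit]
```

For instance, `convert_ms("second", 1500)` divides 1500 ms by 1000 ms per second. The lookup table plays the role that one-method-per-type dispatch plays in the Julia version.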
ExpandingMan / automatetestPOC.jl
Last active December 21, 2017 17:31
rough, primitive proof-of-concept for automated random unit testing in Julia
using Base.Test
using MacroTools
"""
    argnametype(arg::Expr)

Given an argument of the form `x::DType` extracts the argument name (`x`) and
data type (`DType`) as a tuple.
"""
function argnametype(arg::Expr)
ExpandingMan / init.vim
Last active August 3, 2018 20:20
cvim configuration
let searchengine duckduckgo = "https://duckduckgo.com/?q=%s&t=vivaldi"
let defaultengine = "duckduckgo"
" key bindings
map d x
ExpandingMan / write_test_arrow.py
Created March 19, 2019 00:49
writing some arrow test data
import pyarrow as pa
v = pa.array([1,2,3,4])
batch = pa.RecordBatch.from_arrays([v], ["this_is_the_column_name"])
sink = pa.BufferOutputStream()
writer = pa.RecordBatchStreamWriter(sink, batch.schema)
ExpandingMan / jloptshack.jl
Created May 30, 2020 18:47
a hack to turn on trace compiling while Julia is running
function hacked_jlopts(jlopts::Base.JLOptions, str::String)
    f = (n, arg) -> n == :trace_compile ? pointer(str) : arg
    GC.@preserve str Base.JLOptions((f(n, getproperty(jlopts, n)) for n ∈ fieldnames(Base.JLOptions))...)
end
function hack_jlopts(str::String)
    ptr = convert(Ptr{Base.JLOptions}, cglobal(:jl_options))
    jlopts = hacked_jlopts(Base.JLOptions(), str)
    unsafe_store!(ptr, jlopts)
    jlopts
ExpandingMan / ensemble_heisenbug.jl
Created July 8, 2021 02:11
standalone geodesic EnsembleProblem that worked fine
using DifferentialEquations, LinearAlgebra, ThreadsX, DiffEqGPU, StaticArrays, BenchmarkTools
struct DnegWormhole
    ρ::Float64
    a::Float64
    M::Float64
end
const DEFAULT_WORMHOLE = Ref{DnegWormhole}(DnegWormhole(1.0, 0.5, 0.5))
default_wormhole() = DEFAULT_WORMHOLE[]
ExpandingMan / ThePlan.md
Created July 14, 2021 00:12
plan for distributed tables in Julia

The Plan

The ultimate goal is to have a package or set of packages for large-scale distributed computing on tabular data in Julia that works "out of the box". We would like to be able to replace processes that currently run in, e.g., Apache Spark or Python's Dask.

Examples of things we'd like to be able to do:

  • Get table metadata for a table that is spread across a hundred CSV files in HDFS.
  • Join a 10^10-row table stored as Parquet files in S3 buckets with a 100-row table created locally in memory, perform a groupby operation, and save the result as a new table of Parquet files on S3.
Computer Information:
Manufacturer: Unknown
Model: Unknown
Form Factor: Desktop
No Touch Input Detected
Processor Information:
CPU Vendor: AuthenticAMD
CPU Brand: AMD Ryzen 7 3800X 8-Core Processor
CPU Family: 0x17