drin /
Created Oct 14, 2022
Trying to test that HashMultiColumn produces expected hash values for int32_t input values

A simplified version of HashIntImp for testing:

// hash_int based on (672431b)
template <typename T>
uint64_t hash_int(T val) {
  constexpr uint64_t int_const = 11400714785074694791ULL;
  uint64_t cast_val            = static_cast<uint64_t>(val);

  return static_cast<uint64_t>(BYTESWAP(cast_val * int_const));
}
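When checking hash values by hand, it can help to have an independent reference for the arithmetic. The following is a minimal Python sketch of the same computation, assuming that `BYTESWAP` reverses the eight bytes of a 64-bit value and that the multiplication wraps modulo 2^64, as unsigned arithmetic does in C++. The names `MASK64`, `INT_CONST`, and `byteswap64` are introduced here for illustration only and do not come from Arrow.

```python
MASK64 = (1 << 64) - 1            # keeps results within 64 bits
INT_CONST = 11400714785074694791  # multiplier copied from the snippet above


def byteswap64(x: int) -> int:
    """Reverse the byte order of a 64-bit unsigned value."""
    return int.from_bytes(x.to_bytes(8, "little"), "big")


def hash_int(val: int) -> int:
    """Mirror the simplified hash: cast, multiply with wraparound, byte-swap."""
    # Masking mimics static_cast<uint64_t>; negative inputs wrap as in two's complement.
    cast_val = val & MASK64
    # Python integers are unbounded, so the wraparound is applied explicitly.
    product = (cast_val * INT_CONST) & MASK64
    return byteswap64(product)


for sample in (0, 1, 2, -1):
    print(sample, hex(hash_int(sample)))
```

Values printed by this sketch could be compared against the output of the C++ `hash_int` for the same inputs. Whether they also match `HashMultiColumn` depends on how closely the simplified version follows the real implementation.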
drin /
Last active Aug 30, 2022
Some Arrow Benchmarking
// A version that is directly comparable to
static void GreaterEqual(benchmark::State& state) {  // NOLINT non-const reference
  constexpr int64_t test_size = 10000;
  constexpr int64_t max_val   = std::numeric_limits<int64_t>::max();

  auto test_vals = benchmark_rng.Int64(test_size, 0, max_val);
  auto test_ints = std::static_pointer_cast<arrow::Int64Array>(test_vals);

  while (state.KeepRunning()) {
    arrow::BooleanBuilder builder;
drin /
Last active Mar 11, 2022
Reproducible example of Arrow compute functions on composed and decomposed table

"Time by slice" is the total time, summed over running the function on each slice separately. "Time by table" is the total time for running the function once on a single table created by concatenating all of the slices.

Table ID       Columns  Rows   Rows (slice)  Slice count  Time by slice (ms)  Time by table (ms)
E-GEOD-100618  415      20631  299           69           644.065             410
E-GEOD-76312   2152     27120  48            565          25607.927           2953
E-GEOD-106540  2145     24480  45            544          25193.507           3088
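To make the comparison concrete, here is a small sketch that recomputes two things from the figures listed above: whether the row count equals rows per slice times slice count, and how much slower the per-slice run is than the whole-table run. The tuple layout and the names `MEASUREMENTS` and `summarize` are illustrative assumptions. The numbers are copied from the table, not measured again.

```python
# (table_id, rows, rows_per_slice, slice_count, time_by_slice_ms, time_by_table_ms)
MEASUREMENTS = [
    ("E-GEOD-100618", 20631, 299, 69, 644.065, 410.0),
    ("E-GEOD-76312", 27120, 48, 565, 25607.927, 2953.0),
    ("E-GEOD-106540", 24480, 45, 544, 25193.507, 3088.0),
]


def summarize(row):
    """Derive consistency and slowdown figures for one table."""
    table_id, rows, rows_per_slice, slice_count, by_slice_ms, by_table_ms = row
    return {
        "table_id": table_id,
        # True when the slices are equal-sized and cover the whole table
        "rows_consistent": rows == rows_per_slice * slice_count,
        # How many times slower the per-slice run is than the single-table run
        "slowdown": by_slice_ms / by_table_ms,
        # Average time spent per slice, in milliseconds
        "ms_per_slice": by_slice_ms / slice_count,
    }


SUMMARIES = [summarize(row) for row in MEASUREMENTS]

for s in SUMMARIES:
    print(f"{s['table_id']}: consistent={s['rows_consistent']}, "
          f"slowdown={s['slowdown']:.2f}x, per-slice={s['ms_per_slice']:.2f} ms")
```

On these figures the per-slice run is roughly 1.6x slower for the table with 69 slices and between 8x and 9x slower for the two tables with more than 500 slices. That pattern would fit a fixed per-call overhead, though the listing itself does not state a cause.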
drin /
Created Oct 15, 2021
Random python example
class ExampleClass:
    class_var = 'Class Variable'

    def __init__(self, req_param, def_param=10, **kwargs):
        # calling super class "constructor" is *optional*
        self.required_arg = req_param
        self.optional_arg = def_param
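The preview above is cut off, so the following is only a sketch of how such a class might be used. It shows the difference between the class variable and instance attributes, and how the default and extra keyword arguments behave. Storing `kwargs` in an `extra_args` attribute is an assumption made for illustration; the original may handle them differently.

```python
class ExampleClass:
    class_var = 'Class Variable'  # shared by the class and all of its instances

    def __init__(self, req_param, def_param=10, **kwargs):
        self.required_arg = req_param
        self.optional_arg = def_param
        # Assumption: the original snippet is truncated here. Keeping the
        # extra keyword arguments in a dict is one possible choice.
        self.extra_args = dict(kwargs)


first = ExampleClass('a')                               # def_param falls back to 10
second = ExampleClass('b', def_param=20, color='red')   # 'color' lands in kwargs

# Assigning through an instance creates an instance attribute that shadows
# the class attribute for that instance only.
second.class_var = 'Instance Override'

print(first.optional_arg, second.optional_arg, second.extra_args)
print(ExampleClass.class_var, first.class_var, second.class_var)
```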
drin / check-pyarrow-deps.bash
Last active Sep 20, 2021
Arrow from C++ and python
(my-poetry-venv) 14:17 >> python
Python 3.9.6 (default, Jun 30 2021, 10:22:16)
[GCC 11.1.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyarrow
>>> pyarrow.__file__
>>> quit()
(my-poetry-venv) 15:00 >> ldd <path-to-my-poetry-venv>/lib/python3.9/site-packages/pyarrow/
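The session above locates the installed `pyarrow` package and then inspects its shared libraries with `ldd`. A rough Python sketch of the first half is shown below: it finds a package directory without importing the package and lists the shared objects under it, which could then be passed to `ldd` by hand. The helper names `package_dir` and `shared_objects` are hypothetical, and the `*.so*` pattern assumes a Linux layout like the one in the transcript.

```python
import importlib.util
from pathlib import Path


def package_dir(name):
    """Return the directory of an installed top-level package, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None or not spec.has_location or spec.origin is None:
        return None
    return Path(spec.origin).parent


def shared_objects(pkg_dir):
    """List shared-object files below a package directory (Linux naming assumed)."""
    return sorted(p for p in pkg_dir.rglob("*.so*") if p.is_file())


pyarrow_dir = package_dir("pyarrow")
if pyarrow_dir is None:
    print("pyarrow is not installed for this interpreter")
else:
    print("pyarrow lives in:", pyarrow_dir)
    for lib in shared_objects(pyarrow_dir):
        print("  candidate for ldd:", lib)
```

Running this inside the same virtual environment as the transcript should point at the same `site-packages/pyarrow/` directory that `pyarrow.__file__` reports.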
drin / test.r
Last active Jul 14, 2021
R code for using skytether via python
# ------------------------------
# Dependencies
# >> Set python interpreter (rely on pyenv and poetry)
use_python(Sys.which('python'), required=TRUE)
# >> Python dependencies (via reticulate)
skytether <- import('skytether')
Last active Jun 8, 2021
Using Arrow in C++ and R
Package: skytethr
Title: Integration to 'Skytether-singlecell'
Version: 0.1.0
LinkingTo: cpp11, boostfs, arrow
SystemRequirements: C++11
drin / Vagrantfile
Last active Mar 3, 2021
Almost default content of VagrantFile for ubuntu 21.04
# -*- mode: ruby -*-
# vi: set ft=ruby :
# All Vagrant configuration is done below. The "2" in Vagrant.configure
# configures the configuration version (we support older styles for
# backwards compatibility). Please don't change it unless you know what
# you're doing.
Vagrant.configure("2") do |config|
# The most common configuration options are documented and commented below.
# For a complete reference, please see the online documentation at
drin / ArrowAptInstall.bash
Last active Oct 2, 2021
Install arrow using apt
sudo apt update
sudo apt install -y -V ca-certificates lsb-release wget
# modified per comment; since bintray is retired
wget$(lsb_release --id --short | tr 'A-Z' 'a-z')/apache-arrow-apt-source-latest-$(lsb_release --codename --short).deb
sudo apt install -y -V ./apache-arrow-apt-source-latest-$(lsb_release --codename --short).deb
sudo apt update
sudo apt install -y -V libarrow-dev # For C++
sudo apt install -y -V libarrow-dataset-dev # For Arrow Dataset C++
drin /
Created Aug 25, 2016
drin keybase proof

Keybase proof

I hereby claim:

  • I am drin on github.
  • I am octalene on keybase.
  • I have a public key ASBg4Nd2YKsLAi1ZVEpS5SAOZ5LfP-2PtoLl2Z82jhkalAo

To claim this, I am signing this object: