Ian Cook ianmcook

@ianmcook
ianmcook / vcpkg_arrow_cpp_deps_install.txt
Last active January 25, 2021 16:25
output of vcpkg install arrow cpp dependencies
C:\Users\ian>vcpkg install --clean-after-build --triplet x64-windows --x-manifest-root C:\Users\ian\arrow\cpp
A suitable version of cmake was not found (required v3.19.2). Downloading portable cmake v3.19.2...
Downloading cmake...
https://github.com/Kitware/CMake/releases/download/v3.19.2/cmake-3.19.2-win32-x86.zip -> C:\vcpkg\downloads\cmake-3.19.2-win32-x86.zip
Extracting cmake...
A suitable version of 7zip was not found (required v18.1.0). Downloading portable 7zip v18.1.0...
Downloading 7zip...
https://www.nuget.org/api/v2/package/7-Zip.CommandLine/18.1.0 -> C:\vcpkg\downloads\7-zip.commandline.18.1.0.nupkg
Extracting 7zip...
A suitable version of nuget was not found (required v5.5.1). Downloading portable nuget v5.5.1...
ianmcook / nc_tornado_warnings_map.R
Last active March 18, 2021 22:07
Create map of North Carolina tornado alert polygons and home location
install.packages(c("ggmap", "leaflet", "sp"))
devtools::install_github("ianmcook/weatherAlerts")
devtools::install_github("ianmcook/weatherAlertAreas")
# geocode home address
library(ggmap)
register_google("ENTER_GCP_GEOCODING_API_KEY_HERE")
home <- geocode("University of North Carolina at Chapel Hill")
# or specify coordinates of home
ianmcook / enquo_helpers.R
Last active April 10, 2021 03:48
rlang::enquo() helpers for eager evaluation and idempotence
# enquo() helpers for eager evaluation and idempotence
# wrap eager() around enquo() to evaluate the quosure immediately in the calling
# environment *if* it can do so without error, otherwise return the quosure
eager <- function(quo) {
  val <- try(eval_tidy(quo), silent = TRUE)
  if (inherits(val, "try-error")) {
    quo
  } else {
    val
ianmcook / hive-jdbc-example.R
Last active April 26, 2021 05:51
Query Apache Hive from R using JDBC
# Copyright 2018 Cloudera, Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
ianmcook / create_and_print_arrow_table.cpp
Last active June 2, 2022 14:01
Create and print an Arrow Table in C++
#include <iostream>
#include <arrow/api.h>
#include <arrow/result.h>
#include <arrow/compute/api.h>
arrow::Status Execute() {
  arrow::Int32Builder int_builder;
  ARROW_RETURN_NOT_OK(int_builder.Append(1));
  ARROW_RETURN_NOT_OK(int_builder.Append(2));
  ARROW_RETURN_NOT_OK(int_builder.Append(3));
ianmcook / clean_github_jira_ids.R
Last active October 26, 2022 21:26
Match Apache Arrow Jira user accounts with GitHub user accounts
# run this script second
library(dplyr)
df <- read.csv("dirty.csv")
agg <- df %>%
  group_by(jira, github) %>%
  summarise(n = n(), .groups = "keep") %>%
  ungroup() %>%
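The preview above stops partway through the pipeline, but its visible part groups rows by Jira and GitHub account and counts each pair. For readers who do not use dplyr, here is a minimal Python sketch of that counting step only. The sample rows are invented stand-ins for dirty.csv (its contents are not shown here), and the helper name count_pairs is hypothetical.

```python
from collections import Counter

# Invented sample rows; the real dirty.csv is not shown in the preview.
rows = [
    {"jira": "alice", "github": "alice-gh"},
    {"jira": "alice", "github": "alice-gh"},
    {"jira": "alice", "github": "al1ce"},
    {"jira": "bob", "github": "bobby"},
]


def count_pairs(rows):
    """Count occurrences of each (jira, github) pair.

    Roughly mirrors group_by(jira, github) followed by summarise(n = n()).
    """
    return Counter((row["jira"], row["github"]) for row in rows)


pair_counts = count_pairs(rows)
for (jira, github), n in sorted(pair_counts.items()):
    print(jira, github, n)
```

A pair that occurs many times is more likely to be a genuine match than one that occurs once, which is presumably why the count is computed before any further cleaning.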
ianmcook / duckdb_ibis_example.py
Created January 24, 2023 18:01
Ibis + DuckDB example
# pip install 'ibis-framework[duckdb]'
import pandas as pd
import ibis
from ibis import _
# create a pandas DataFrame and write it to a Parquet file
df = pd.DataFrame(data={'repo': ['pandas', 'duckdb', 'ibis'],
                        'stars': [36622, 8074, 2336]})
df.to_parquet('repo_stars.parquet')
ianmcook / acero_execplan.cpp
Last active January 30, 2023 21:55
Create and execute an Acero ExecPlan
#include <iostream>
#include <arrow/api.h>
#include <arrow/result.h>
#include <arrow/compute/api.h>
#include <arrow/compute/exec/exec_plan.h>
arrow::Status ExecutePlanAndCollectAsTable(
    std::shared_ptr<arrow::compute::ExecPlan> plan,
    std::shared_ptr<arrow::Schema> schema,
    arrow::AsyncGenerator<std::optional<arrow::compute::ExecBatch>> sink_gen) {
ianmcook / ibis_trino.py
Last active April 9, 2023 12:02
Simple Ibis Trino demo
# before running:
# 1. install Ibis and its Trino backend: https://ibis-project.org/backends/Trino/
# 2. pull and run the Trino docker container: https://trino.io/docs/current/installation/containers.html
import ibis
from ibis import _
# connect to Trino
conn = ibis.trino.connect(database='memory', schema='default')
ianmcook / ibis_snowflake_tpc-h_1.py
Last active April 12, 2023 18:07
Ibis Snowflake TPC-H Query 1
# before running:
# 1. install Ibis and its Snowflake backend: https://ibis-project.org/backends/Snowflake/
# 2. create and activate a Snowflake trial account
# 3. set environment variables SNOWSQL_USER, SNOWSQL_PWD, SNOWSQL_ACCOUNT
import os
import ibis
from ibis import _
ibis.options.interactive = True
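Step 3 of the setup notes relies on three environment variables being set before the script runs. A minimal sketch of a pre-flight check is below. It uses only the standard library, the variable names come from the notes above, and the helper name read_snowflake_env is hypothetical and not part of the gist or of Ibis.

```python
import os

# Variable names are taken from the setup notes; everything else is illustrative.
REQUIRED_VARS = ("SNOWSQL_USER", "SNOWSQL_PWD", "SNOWSQL_ACCOUNT")


def read_snowflake_env(env=None):
    """Return the three Snowflake settings, or raise if any is unset or empty.

    `env` defaults to os.environ; a plain dict can be passed for testing.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Calling read_snowflake_env() before connecting reports every missing variable at once instead of failing partway through on the first absent one.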