Jim Pivarski jpivarski

  • Princeton, IRIS-HEP, PyHEP, Scikit-HEP
jpivarski / kerchunk-root-buffer-layout.svg
Created February 17, 2022 02:06
Description of the structure of a ROOT TBasket for the kerchunk meeting
jpivarski / CMakeLists.txt
Last active March 31, 2022 12:01
Very provisional ctypes interface to Clang Incremental
# Author: Vassil Vassilev
cmake_minimum_required(VERSION 3.10)
project(aarray-example)
#conda install -c conda-forge/label/llvm_rc clangdev=14.0.0.rc2
set(conda_path "/home/jpivarski/mambaforge/envs/vassil-clang-python/")
set(CMAKE_FIND_PACKAGE_SORT_ORDER NATURAL)
set(CMAKE_FIND_PACKAGE_SORT_DIRECTION DEC)
jpivarski / argo-demo.ipynb
Last active April 20, 2022 02:09
Argo-Awkward Array demo
jpivarski / Awkward Arrays in spatialpandas.md
Last active June 21, 2022 18:32
Awkward Arrays in spatialpandas

I managed to iterate over Awkward Arrays and rasterize the NYC buildings as polygons. The spatialpandas code is pretty well integrated with the ragged data structures you've built; there's a lot of code that twiddles offset arrays. I couldn't use the build_polygon function directly, but ported over enough of it into my own Numba-compiled function to reproduce the output.

These are Matplotlib imshow displays of images made by iterating over Awkward Arrays. The axes are flipped from the usual longitude, latitude order because I'm just dumping the array as an image, but I verified on one complex building that I exactly reproduce spatialpandas's output (including the short-circuit code paths, in which a polygon is smaller than a pixel). The first image is low-resolution and the second is high-resolution: they correspond to the minimum and maximum number of pixels used in the performance studies later in this email.
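The offset-array layout mentioned above can be illustrated with a minimal pure-Python sketch. The data, the function names `bounding_boxes` and `smaller_than_pixel`, and the pixel size are hypothetical; this is not the spatialpandas `build_polygon` logic or the Numba-compiled port, only an illustration of walking a ragged array through its offsets and of a "smaller than a pixel" check of the kind a short-circuit path might use.

```python
# Hypothetical data: two polygons stored as flat coordinate lists plus
# an offsets list, which is the layout ragged arrays use internally.
xs = [0.0, 4.0, 4.0, 0.0, 10.0, 10.2, 10.1]
ys = [0.0, 0.0, 3.0, 3.0, 5.0, 5.0, 5.3]
offsets = [0, 4, 7]  # polygon i spans [offsets[i], offsets[i + 1])


def bounding_boxes(xs, ys, offsets):
    """Return (xmin, ymin, xmax, ymax) for each polygon."""
    boxes = []
    for i in range(len(offsets) - 1):
        start, stop = offsets[i], offsets[i + 1]
        px, py = xs[start:stop], ys[start:stop]
        boxes.append((min(px), min(py), max(px), max(py)))
    return boxes


def smaller_than_pixel(box, pixel_size):
    """True if the polygon's extent is below one pixel in both directions."""
    xmin, ymin, xmax, ymax = box
    return (xmax - xmin) < pixel_size and (ymax - ymin) < pixel_size


boxes = bounding_boxes(xs, ys, offsets)
flags = [smaller_than_pixel(box, pixel_size=1.0) for box in boxes]
```

Here the first polygon spans several pixels and would need full rasterization, while the second is flagged as sub-pixel and could be handled by a cheaper path.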

jpivarski / _Converting Argo data from NetCDF4 to Parquet.md
Created May 2, 2022 18:52
Converting Argo data from NetCDF4 to Parquet

First, I fetched all of the Argo data up through 2021 via ftp:

wget -r ftp://ftp.ifremer.fr/ifremer/argo

# wait a long time

tree ftp.ifremer.fr
└── ifremer
    └── argo

Uproot version 5.0.0

Uproot version 5 adds a few major new features, removes one (uproot.lazy), and is based on Awkward Array version 2 instead of version 1.

uproot.lazy → uproot.dask

@kkothari2001 upgraded Uproot from Awkward version 1 to version 2. The major part of that work was replacing uproot.lazy, which was based on Awkward 1's virtual and partitioned lazy arrays, with a function that returns the new Dask collection, dask-awkward. The entry point for the new function is uproot.dask.

@kkothari2001 also simplified Uproot's Pandas backend, which used to "explode" ragged arrays from ROOT into Pandas DataFrames with a non-trivial MultiIndex. Now, it takes advantage of awkward-pandas to put ragged (and more complex) Awkward Arrays directly into Pandas columns.