@tahashmi
tahashmi / arrow_flight_dremio.py
Last active February 9, 2023 20:36 — forked from koolay/arrow_flight_dremio.py
demo of arrow-flight+dremio+vaex
from collections import namedtuple
import vaex
import time
import orjson
import os
import psutil
from pyarrow import flight
import pyarrow as pa
@koolay
koolay / arrow_flight_dremio.py
Created July 28, 2020 04:01
demo of arrow-flight+dremio+vaex
from collections import namedtuple
import vaex
import time
import orjson
import os
import psutil
from pyarrow import flight
import pyarrow as pa
@willirath
willirath / SLURMCluster_vs_Singularity.ipynb
Last active December 11, 2023 14:15
Dask-Jobqueue SLURMCluster with Singularity
@songpon
songpon / ubuntu-install-dremio.sh
Created February 5, 2020 04:07
ubuntu 18.04 install dremio driver
wget https://download.dremio.com/odbc-driver/1.4.2.1003/dremio-odbc-1.4.2.1003-1.x86_64.rpm
sudo apt-get install alien unixodbc-dev -y
sudo alien dremio-odbc-1.4.2.1003-1.x86_64.rpm
sudo dpkg -i dremio-odbc_1.4.2.1003-2_amd64.deb
@cjnolet
cjnolet / cuml-kmeans-mnmg-api.md
Last active August 17, 2022 05:35
Simple example of cuML's K-Means Single-GPU (SG) and Multi-Node Multi-GPU (MNMG) APIs compared to Scikit-learn and Dask-ML

Comparing cuML K-Means API Against Scikit-learn & Dask-ML

First, a quick code example of K-Means in Scikit-learn

from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans

n_centers = 5

# make_blobs takes the number of blob centers via `centers`, not `n_centers`
X, _ = make_blobs(n_samples=10000, centers=n_centers)
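The preview above stops before the model is fitted. As a minimal sketch of how the scikit-learn side of the comparison might continue (this is not the gist author's code; the `random_state` and `n_init` values are assumptions chosen here for reproducibility), the data can be clustered like this:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

n_centers = 5

# Synthetic data: 10,000 points (2 features by default) drawn around 5 blob centers.
# random_state is an assumption added here so that repeated runs match.
X, _ = make_blobs(n_samples=10000, centers=n_centers, random_state=0)

# n_init is set explicitly because its default differs between scikit-learn releases.
kmeans = KMeans(n_clusters=n_centers, n_init=10, random_state=0)

# fit_predict learns the centroids and returns one cluster label per sample.
labels = kmeans.fit_predict(X)

print(kmeans.cluster_centers_.shape)
```

The fitted centroids are available in `kmeans.cluster_centers_`, and `labels` holds one cluster index per input row.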
@jasonrig
jasonrig / run_spark_cluster.sh
Created August 9, 2019 03:47
Example SLURM job script to start a Spark cluster
#!/bin/bash
#SBATCH --job-name spark-cluster
#SBATCH --account=qh82
#SBATCH --time=02:00:00
# --- Master resources ---
#SBATCH --nodes=1
#SBATCH --mem-per-cpu=1G
#SBATCH --cpus-per-task=1
#SBATCH --ntasks-per-node=1
# --- Worker resources ---
#!/usr/bin/env bash
set -eu
PWD="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
SRC_DIR=$(realpath "${PWD}/..")
CXX_SRC=${SRC_DIR}/cpp
# The following can be set
: "${CMAKE:=cmake}"
@andyweizhao
andyweizhao / cuda_installation_on_ubuntu_18.04
Last active October 11, 2021 17:56 — forked from Mahedi-61/cuda_11.8_installation_on_Ubuntu_22.04
cuda 9.0 complete installation procedure for ubuntu 18.04 LTS
#!/bin/bash
## This gist contains step-by-step instructions to install CUDA v9.0 and cuDNN 7.3 on Ubuntu 18.04
### steps ####
# verify the system has a CUDA-capable GPU
# download and install the NVIDIA CUDA toolkit and cuDNN
# set up environment variables
# verify the installation
###
@linar-jether
linar-jether / PySpark DataFrame from many small pandas DataFrames.ipynb
Created July 8, 2018 10:15
Convert an RDD of pandas DataFrames to a single Spark DataFrame using Arrow, without collecting all the data in the driver.