Joseph Winston josephwinston

@jarohen
jarohen / rafting.clj
Last active September 12, 2023 16:12
A quick hack at actually implementing the Raft algorithm to aid my learning - thanks @bbengfort and @krisajenkins for the podcast!
(ns rafting
  (:require [clojure.tools.logging :as log])
  (:import java.lang.AutoCloseable
           (java.util.concurrent CompletableFuture Executors TimeUnit)))

(defn new-timeout-ms []
  (+ (System/currentTimeMillis)
     200
     (rand-int 150)))
@roaramburu
roaramburu / tpch_sf1000_single_gpu.ipynb
Last active October 10, 2020 06:03
An example notebook running a 1TB query on a single GPU using https://app.blazingsql.com
@thomwolf
thomwolf / loading_wikipedia.py
Last active January 18, 2024 14:04
Load full English Wikipedia dataset in HuggingFace nlp library
import os
import psutil
import timeit
from datasets import load_dataset
mem_before = psutil.Process(os.getpid()).memory_info().rss >> 20
wiki = load_dataset("wikipedia", "20200501.en", split='train')
mem_after = psutil.Process(os.getpid()).memory_info().rss >> 20
print(f"RAM memory used: {(mem_after - mem_before)} MB")
s = """batch_size = 1000
for i in range(0, len(wiki), batch_size):
@lucj
lucj / k3s-multipass.sh
Created December 17, 2019 21:16
Setup a k3s kubernetes cluster using Multipass VMs
for node in node1 node2 node3; do
  multipass launch -n "$node"
done
# Init cluster on node1
multipass exec node1 -- bash -c "curl -sfL https://get.k3s.io | sh -"
# Get node1's IP
IP=$(multipass info node1 | grep IPv4 | awk '{print $2}')
@lizthegrey
lizthegrey / attributes.rb
Last active February 24, 2024 14:11
Hardening SSH with 2fa
default['sshd']['sshd_config']['AuthenticationMethods'] = 'publickey,keyboard-interactive:pam'
default['sshd']['sshd_config']['ChallengeResponseAuthentication'] = 'yes'
default['sshd']['sshd_config']['PasswordAuthentication'] = 'no'
@terry90
terry90 / Dockerfile
Last active February 5, 2022 16:29
Rust Dockerfile to build really small containers with Postgres and SSL (~20 MB with Rocket and Diesel). Dependencies are cached for faster builds.
FROM clux/muslrust as build
WORKDIR /app/
# Deps caching begins
COPY Cargo.toml .
COPY Cargo.lock .
RUN mkdir src
RUN echo "fn main() {}" > src/main.rs

Generating Flame Graphs for Apache Spark

Flame graphs are a nifty debugging tool for determining where CPU time is being spent. Using the Java Flight Recorder, you can generate them for Java processes without adding significant runtime overhead.
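As a rough illustration of what a flame graph summarizes, here is a minimal sketch that aggregates sampled stack traces and reports the share of samples passing through a matching frame. It assumes the "folded" text format (semicolon-separated frames followed by a sample count) that common flame graph tooling consumes; the sample data and the helper names `parse_folded` and `share_matching` are hypothetical and not taken from the gist.

```python
# Hypothetical folded stack samples: "frame;frame;frame count".
# Real input would come from a profiler recording, not a literal string.
FOLDED = """\
main;schedule;serializeTask 6
main;schedule;launchTask 2
main;heartbeat 2
"""


def parse_folded(text):
    """Return a list of (frames, count) pairs from folded stack lines."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # The sample count is the last space-separated field on the line.
        stack, _, count = line.rpartition(" ")
        samples.append((stack.split(";"), int(count)))
    return samples


def share_matching(samples, needle):
    """Fraction of samples whose stack has a frame containing `needle`."""
    total = sum(count for _, count in samples)
    hits = sum(
        count
        for frames, count in samples
        if any(needle.lower() in frame.lower() for frame in frames)
    )
    return hits / total if total else 0.0


samples = parse_folded(FOLDED)
print(f"{share_matching(samples, 'serialize'):.1%}")
```

Searching a rendered flame graph for a term performs a similar aggregation: it highlights matching frames and reports the fraction of samples they cover.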

When are flame graphs useful?

Shivaram Venkataraman and I have found these flame graphs useful for diagnosing coarse-grained performance problems. We started using them at the suggestion of Josh Rosen, who quickly made one for the Spark scheduler when we were talking to him about why the scheduler caps out at a throughput of a few thousand tasks per second. Josh generated a graph similar to the one below, which illustrates that a significant amount of time is spent in serialization (if you click in the top right-hand corner and search for "serialize", you can see that 78.6% of the sampled CPU time was spent in serialization). We used this insight to spee

Build TensorFlow on OS X with NVIDIA CUDA support (GPU acceleration)

These instructions are based on Mistobaan's gist but expanded and updated to work with the latest TensorFlow OS X CUDA PR.

Requirements

OS X 10.10 (Yosemite) or newer

@josephwinston
josephwinston / pf.md
Created July 25, 2014 23:57 — forked from ryanzhou/pf.md

Getting Pow to work in OS X Yosemite

Some parts taken from: https://gist.github.com/kujohn/7209628

ipfw is officially deprecated and has been removed in OS X Yosemite. Pow requires another program, pf, to handle the port forwarding.

1. Anchor file

Create file /etc/pf.anchors/pow