Liang-Chi Hsieh viirya

@adrienbrault
adrienbrault / llama2-mac-gpu.sh
Last active July 1, 2024 05:32
Run Llama-2-13B-chat locally on your M1/M2 Mac with GPU inference. Uses 10GB RAM. UPDATE: see https://twitter.com/simonw/status/1691495807319674880?s=20
# Clone llama.cpp
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
# Build it
make clean
LLAMA_METAL=1 make
# Download model
export MODEL=llama-2-13b-chat.ggmlv3.q4_0.bin

Generating Flame Graphs for Apache Spark

Flame graphs are a nifty debugging tool for determining where CPU time is being spent. Using the Java Flight Recorder, you can generate them for Java processes without adding significant runtime overhead.
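
To make "where CPU time is being spent" concrete, here is a minimal sketch of how sampled stacks can be turned into a share-of-samples figure, which is roughly what searching within a flame graph reports. It assumes the samples are already in the folded-stack text format (`frame;frame;frame count`) commonly used by flame graph tooling. The helper names `parse_folded` and `share_matching` and the sample data are hypothetical and for illustration only.

```python
from collections import Counter


def parse_folded(lines):
    """Parse folded-stack lines such as "a;b;c 12" into a Counter keyed by stack tuple."""
    stacks = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # The sample count is the last space-separated field; frames may contain spaces.
        frames, _, count = line.rpartition(" ")
        stacks[tuple(frames.split(";"))] += int(count)
    return stacks


def share_matching(stacks, needle):
    """Return the fraction of samples whose stack has a frame containing `needle`."""
    total = sum(stacks.values())
    if total == 0:
        return 0.0
    hits = sum(
        count
        for stack, count in stacks.items()
        if any(needle in frame for frame in stack)
    )
    return hits / total


# Hypothetical samples: 6 of 10 land in a serialization frame.
SAMPLES = [
    "main;schedule;serializeTask 6",
    "main;schedule;launch 2",
    "main;heartbeat 2",
]

stacks = parse_folded(SAMPLES)
fraction = share_matching(stacks, "serialize")
print(f"{fraction:.1%}")
```

Each stack is counted once even if several of its frames match, so nested matches do not inflate the share.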

When are flame graphs useful?

Shivaram Venkataraman and I have found these flame recordings to be useful for diagnosing coarse-grained performance problems. We started using them at the suggestion of Josh Rosen, who quickly made one for the Spark scheduler when we were talking to him about why the scheduler caps out at a throughput of a few thousand tasks per second. Josh generated a graph similar to the one below, which illustrates that a significant amount of time is spent in serialization (if you click in the top right-hand corner and search for "serialize", you can see that 78.6% of the sampled CPU time was spent in serialization). We used this insight to spee

#!/bin/bash
# Run this on this AMI on AWS:
# https://console.aws.amazon.com/ec2/v2/home?region=us-east-1#LaunchInstanceWizard:ami=ami-b36981d8
# You should end up with a fully working GPU-enabled TensorFlow installation.
cd ~
# grab cuda 7.0
@dmvaldman
dmvaldman / FRPandPhilosophy.md
Last active February 23, 2024 16:24
Descartes, Berkeley and Functional Reactive Programming

By @dmvaldman

Functional Reactive Programming (FRP) is generating buzz as an alternative to Object Oriented Programming (OOP) for certain use cases. However, an internet search quickly leads a curious and optimistic reader into the rabbit-hole of monads, functors, and other technical jargon. I’ve since emerged from this dark and lonely place with the realization that these words are mere implementation details, and that the core concepts are far more universal. In fact, the groundwork was laid down many centuries before the first computer, and has more to do with interpretations of reality than with structuring programs. Allow me to explain.

There’s an old thought experiment that goes like this:

Tree

@silasdavis
silasdavis / MultipleOutputs.scala
Last active January 18, 2022 07:07
Wrapping OutputFormat to produce multiple outputs with Hadoop MultipleOutputs
/**
* This file contains the core idea of wrapping an underlying OutputFormat with an OutputFormat
* with an augmented key that writes to partitions using MultipleOutputs (or something similar)
*/
package model.hadoop
import model.hadoop.HadoopIO.MultipleOutputer
import model.hadoop.HadoopIO.MultipleOutputer._
import org.apache.hadoop.io.{DataInputBuffer, NullWritable}
@yoavg
yoavg / lm_example
Created May 22, 2015 23:43
Unreasonable Effectiveness of LMs
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# The unreasonable effectiveness of Character-level Language Models\n",
"## (and why RNNs are still cool)\n",
"\n",
"###[Yoav Goldberg](http://www.cs.biu.ac.il/~yogo)\n",
@karpathy
karpathy / gist:587454dc0146a6ae21fc
Last active July 11, 2024 10:36
An efficient, batched LSTM.
"""
This is a batched LSTM forward and backward pass
"""
import numpy as np
import code
class LSTM:
@staticmethod
def init(input_size, hidden_size, fancy_forget_bias_init = 3):
@debasishg
debasishg / gist:8172796
Last active July 5, 2024 11:53
A collection of links for streaming algorithms and data structures

General Background and Overview

  1. Probabilistic Data Structures for Web Analytics and Data Mining : A great overview of the space of probabilistic data structures and how they are used in approximation algorithm implementation.
  2. Models and Issues in Data Stream Systems
  3. Philippe Flajolet’s contribution to streaming algorithms : A presentation by Jérémie Lumbroso that visits some of the historical perspectives and how it all began with Flajolet.
  4. Approximate Frequency Counts over Data Streams by Gurmeet Singh Manku & Rajeev Motwani : One of the early papers on the subject.
  5. [Methods for Finding Frequent Items in Data Streams](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.187.9800&rep=rep1&t