🏠
Working

Liutong Zhou LiutongZhou

  • Apple
  • New York
@LiutongZhou
LiutongZhou / memory_efficient_training.md
Last active July 11, 2023 15:37
Memory Efficient Training of LLMs
"""Data Structures that extend OrderedDict"""
from collections import Counter, OrderedDict
from typing import Any, Hashable, Optional, Tuple, List
from hypothesis import given, strategies as st
__all__ = ["OrderedDefaultDict", "MinMaxCounter"]
class OrderedDefaultDict(OrderedDict):
@LiutongZhou
LiutongZhou / docker_tips.md
Last active February 10, 2023 15:54
Docker Tips

Move default docker storage to another location

nano /etc/docker/daemon.json

## add this config
{
  "data-root": "/newlocation"
}
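
If `/etc/docker/daemon.json` already holds other settings, editing it by hand risks dropping them. The snippet below is a minimal sketch of merging the `data-root` key into an existing file. The helper name `set_docker_data_root` is hypothetical (it is not part of Docker or of this gist), and the sketch assumes the file, if present, contains a JSON object.

```python
import json
from pathlib import Path


def set_docker_data_root(config_path: str, new_root: str) -> dict:
    """Set "data-root" in a daemon.json file, keeping any other keys.

    Hypothetical helper for illustration only.
    """
    path = Path(config_path)
    config = {}
    # Start from the existing settings when the file already has content.
    if path.exists() and path.read_text().strip():
        config = json.loads(path.read_text())
    config["data-root"] = new_root
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Writing to `/etc/docker/daemon.json` normally requires root, and the Docker daemon has to be restarted before the new location takes effect.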
@LiutongZhou
LiutongZhou / document_project.md
Last active August 18, 2022 19:52
Document a Project

How to document a project

How to update the docs and publish to the Home Page?

Prerequisites: one-time installation of dependencies
python3 -m pip install -U jupyter-book ghp-import
@LiutongZhou
LiutongZhou / dist-train.md
Last active February 10, 2023 15:55
Large-Scale Distributed Data and Model Parallel Training

Data Streaming

FastFile Mode

sagemaker.inputs.TrainingInput(S3_INPUT_FOLDER, input_mode='FastFile') 
@LiutongZhou
LiutongZhou / bitmask.py
Last active February 6, 2022 04:57
BitMask
"""BitMask"""
class BitMask:
    __slots__ = ("size", "mask")
    def __init__(self, size: int = 16):
        """Create a bit mask to store (size) 0/1 status"""
        self.size = size
        self.mask = 1 << size
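
The preview stops after `__init__`, so the rest of the class is not shown here. As a rough sketch of the usual bit operations on such a mask, the functions below treat the bit set by `1 << size` as a sentinel sitting just above the `size` usable slots. That is one plausible reading, not something the preview confirms. The names `make_mask`, `set_bit`, `clear_bit`, and `get_bit` are hypothetical and do not come from the gist.

```python
def make_mask(size: int = 16) -> int:
    """Return an empty mask whose only set bit is the sentinel at index `size`."""
    return 1 << size


def set_bit(mask: int, i: int) -> int:
    """Return `mask` with slot `i` set to 1."""
    return mask | (1 << i)


def clear_bit(mask: int, i: int) -> int:
    """Return `mask` with slot `i` set to 0."""
    return mask & ~(1 << i)


def get_bit(mask: int, i: int) -> bool:
    """Report whether slot `i` is set."""
    return bool((mask >> i) & 1)
```

With a sentinel bit in place, `bin(mask)` always prints all `size` slots, including leading zeros, which can make debugging easier.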
@LiutongZhou
LiutongZhou / heap.py
Last active June 6, 2022 01:08
Priority queues
"""MinHeap and MaxHeap (Optimized Implementation)"""
from abc import ABC, abstractmethod
from collections import Counter, UserList
from functools import singledispatchmethod
from heapq import (
    _heapify_max,
    _heappop_max,
    _heapreplace_max,
    _siftdown,
    _siftdown_max,
"""UnionFind (Disjoint Sets)"""
from typing import Optional, Iterable, Hashable, Any
class UnionFind:
    def __init__(
        self, initial_disjoint_items: Optional[Iterable[Hashable]] = None
    ):
        """Initialize a UnionFind of disjoint sets"""
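
Only the constructor signature is visible in the preview. For orientation, here is a minimal sketch of a disjoint-set structure with path compression and union by size. It is an illustration of the general technique, not the gist's implementation; the class name `SimpleUnionFind` and its methods are assumptions.

```python
from typing import Dict, Hashable, Iterable, Optional


class SimpleUnionFind:
    """Illustrative disjoint-set forest (hypothetical, not the gist's class)."""

    def __init__(self, items: Optional[Iterable[Hashable]] = None):
        self.parent: Dict[Hashable, Hashable] = {}
        self.size: Dict[Hashable, int] = {}
        for item in items or ():
            self.add(item)

    def add(self, item: Hashable) -> None:
        """Register `item` as its own singleton set if it is new."""
        if item not in self.parent:
            self.parent[item] = item
            self.size[item] = 1

    def find(self, item: Hashable) -> Hashable:
        """Return the representative of the set containing `item`."""
        self.add(item)
        root = item
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path directly at the root.
        while self.parent[item] != root:
            self.parent[item], item = root, self.parent[item]
        return root

    def union(self, a: Hashable, b: Hashable) -> bool:
        """Merge the sets of `a` and `b`; return False if already merged."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        # Union by size: attach the smaller tree under the larger one.
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        return True
```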
@LiutongZhou
LiutongZhou / sampling.py
Created February 9, 2021 05:48
Time-and-Space-Efficient Sampling Methods
""" Time-and-Space-Efficient Sampling Methods"""
from typing import Iterable
from random import random, randint, seed
from math import exp, log, floor
from itertools import islice
__author__ = "Liutong Zhou"
def take(iterable, n: int) -> list:
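
The preview ends at the signature of `take`, so the sampling methods themselves are not visible. One common technique in this family is reservoir sampling (Algorithm R), which draws `k` items uniformly from a stream of unknown length in a single pass using O(k) memory. The sketch below illustrates that technique and is not necessarily what the gist implements; `reservoir_sample` and its `rng` parameter are assumptions.

```python
from itertools import islice
from random import Random
from typing import Iterable, List, Optional


def reservoir_sample(
    iterable: Iterable, k: int, rng: Optional[Random] = None
) -> List:
    """Return up to `k` items drawn uniformly at random from `iterable`."""
    rng = rng or Random()
    it = iter(iterable)
    # Fill the reservoir with the first k items.
    reservoir = list(islice(it, k))
    # The item at index i (0-based) replaces a random slot with probability k / (i + 1).
    for i, item in enumerate(it, start=k):
        j = rng.randint(0, i)  # randint is inclusive on both ends
        if j < k:
            reservoir[j] = item
    return reservoir
```

Passing a seeded `Random` instance makes a run reproducible without touching the global random state.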
@LiutongZhou
LiutongZhou / remote_jupyter_setup.md
Last active April 7, 2024 16:50
Setup Cloud9 on EC2 and remote Jupyter

Create an EC2 instance on AWS

  • Launch: t3.2xlarge ($0.33/h) / m5d.4xlarge ($0.904/h) / g4dn.4xlarge ($1.2/h) / p3.2xlarge ($3.02/h)
  • Image: Deep Learning AMI (Ubuntu 22.04)
  • Configure Security Group:
    • open a custom TCP rule on port 9999
    • open HTTPS, HTTP to anywhere
  • Attach an Elastic IP to the instance

ssh into EC2 from MobaXterm and run