Zach Mueller muellerzr

@muellerzr
muellerzr / base_drivers.txt
Created April 15, 2024 17:59
P2P tests with 4090s
[P2P (Peer-to-Peer) GPU Bandwidth Latency Test]
Device: 0, NVIDIA GeForce RTX 4090, pciBusID: 1, pciDeviceID: 0, pciDomainID:0
Device: 1, NVIDIA GeForce RTX 4090, pciBusID: 2, pciDeviceID: 0, pciDomainID:0
Device=0 CANNOT Access Peer Device=1
Device=1 CANNOT Access Peer Device=0
***NOTE: In case a device doesn't have P2P access to other one, it falls back to normal memcopy procedure.
So you can see lesser Bandwidth (GB/s) and unstable Latency (us) in those cases.
P2P Connectivity Matrix
import builtins
import fcntl
import os
import socket
import torch
import torch.distributed as dist
print("STARTED")
def print(*args, **kwargs):
@muellerzr
muellerzr / test.py
Created September 15, 2023 18:29
Model memory stuff
import torch
from transformers import AutoModel, AutoConfig, AutoModelForSequenceClassification
def get_model_memory(model: torch.nn.Module):
    """
    Returns the memory usage of the given model
    """
    total_memory = 0
    for param in model.parameters():
        total_memory += param.numel() * param.element_size()
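The preview above is cut off, but the loop it shows sums `numel * element_size` over every parameter. The following is a minimal sketch of that same arithmetic in plain Python, without torch. The function name `estimate_param_memory`, the dtype table, and the example shapes are all hypothetical, chosen only for illustration.

```python
from math import prod

# Bytes per element for a few common dtypes (illustrative labels, not torch objects).
BYTES_PER_ELEMENT = {"float32": 4, "float16": 2, "bfloat16": 2, "int8": 1}


def estimate_param_memory(param_shapes, dtype="float32"):
    """Sum numel * element_size over the given parameter shapes; returns bytes."""
    element_size = BYTES_PER_ELEMENT[dtype]
    total_memory = 0
    for shape in param_shapes:
        numel = prod(shape)  # number of elements in this tensor
        total_memory += numel * element_size
    return total_memory


# Hypothetical model: a 768x768 weight with bias, then a 2x768 head with bias.
shapes = [(768, 768), (768,), (2, 768), (2,)]
total_bytes = estimate_param_memory(shapes, "float32")
print(f"{total_bytes / 1024**2:.2f} MiB")
```

This counts parameters only; buffers, gradients, and optimizer state are not included in the estimate.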
@muellerzr
muellerzr / hide_sidebar.js
Created June 8, 2023 22:42
JavaScript that hides semantically versioned sidebars in Quarto. Designed to be used in conjunction with nbquarto or referenced from it
/**
 * Enables semantic versioning through careful sidebar menu item selection.
 * Hide sidebar menu items that are not related to the current page that is open.
 * Assumes a directory structure of:
 * - version_1
 *   - page_1
 *   - page_2
 * - version_2
 *   - page_2
 *   - page_3
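The comment above describes hiding sidebar items that belong to a different version than the open page. As a rough sketch of that selection logic, written in Python rather than the gist's JavaScript, and assuming the first path segment is the version directory (the helper `items_to_hide` and the example paths are hypothetical):

```python
def items_to_hide(current_path, sidebar_paths):
    """Return sidebar paths whose top-level version directory differs from the current page's."""
    current_version = current_path.strip("/").split("/")[0]
    return [
        path
        for path in sidebar_paths
        if path.strip("/").split("/")[0] != current_version
    ]


# Example layout matching the directory structure in the comment.
sidebar = [
    "version_1/page_1",
    "version_1/page_2",
    "version_2/page_2",
    "version_2/page_3",
]
hidden = items_to_hide("version_1/page_2", sidebar)
print(hidden)
```

Comparing only the leading segment keeps pages with the same name in different versions (such as `page_2`) from being confused with one another.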
@muellerzr
muellerzr / scratchpad.ipynb
Last active May 9, 2023 09:10
Intel XPU issue T4 Colab
@muellerzr
muellerzr / checkpointing.py
Created May 9, 2023 09:05
Checkpointing script to test cuda device
# coding=utf-8
# Copyright 2021 The HuggingFace Inc. team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
@muellerzr
muellerzr / scratchpad.ipynb
Created February 13, 2023 21:02
Big Model Memory Leak
@muellerzr
muellerzr / boundaries.md
Last active February 23, 2023 01:01
Tweets I'm saving

By Janine Sickmeyer

One of the best things I've learned in therapy is how to establish and enforce strong boundaries.

Protect your time:

• if it’s not a hell yes!, it’s a no.
• take breaks to come back stronger
• slow down to find clarity of purpose
• you don’t NEED a reason to say no

@muellerzr
muellerzr / xla_scriptv2.py
Last active November 8, 2022 02:04
Second version
import torch_xla.distributed.xla_multiprocessing as xmp
import torch_xla.core.xla_model as xm

def _mp_fn(index):
    print("I did something!")
    xm.rendezvous('checking_out')

if __name__ == "__main__":
    xmp.spawn(_mp_fn, args=(), nprocs=8)

import torch_xla.core.xla_model as xm
import torch_xla.core.xla_env_vars as xenv
import os

def main(args=None):
    print("I did something!")
    if os.getenv(xenv.HOST_WORLD_SIZE, None) and xm.xrt_world_size() > 1:
        xm.rendezvous("checking_out")

if __name__ == "__main__":