# The public plotly graphs to include in the report. These can also be generated with `py.plot(figure, filename)`
graphs = [
    'https://plotly.com/~christopherp/308',
    'https://plotly.com/~christopherp/306',
    'https://plotly.com/~christopherp/300',
    'https://plotly.com/~christopherp/296'
]

def report_block_template(report_type, graph_url, caption=''):
    if report_type == 'interactive':
# Assumes torch.nn in scope, as in the tweet-length ConvMixer version:
from torch.nn import (Sequential, Conv2d, GELU, BatchNorm2d,
                      AdaptiveAvgPool2d, Flatten, Linear)

def ConvMixr(h, d, k, p, n):
    def A(x):
        return Sequential(x, GELU(), BatchNorm2d(h))
    class R(Sequential):
        def forward(self, x):
            return self[0](x) + x
    return Sequential(
        A(Conv2d(3, h, p, p)),
        # Preview truncated here; the tail is reconstructed from the
        # standard ConvMixer recipe (depthwise conv + residual, pointwise conv).
        *[Sequential(R(A(Conv2d(h, h, k, groups=h, padding=k // 2))),
                     A(Conv2d(h, h, 1))) for _ in range(d)],
        AdaptiveAvgPool2d(1), Flatten(), Linear(h, n))
{
"info": {
"author": "Google Inc.",
"author_email": "packages@tensorflow.org",
"bugtrack_url": null,
"classifiers": [
"Development Status :: 5 - Production/Stable",
"Environment :: GPU :: NVIDIA CUDA :: 11.0",
"Intended Audience :: Developers",
"Intended Audience :: Education",
@shawwn
shawwn / hackernews-new-comms.js
Last active September 24, 2021 22:19 — forked from linkdd/hackernews-new-comms.js
Add a bell emoji to unread comments on HackerNews
// Can be used with https://github.com/xcv58/Custom-JavaScript-for-Websites-2
// This snippet is released under the terms of the CC0 license: https://creativecommons.org/publicdomain/zero/1.0/deed.en
// Updated by sillysaurusx (v0.1.0):
// - bells stay till you mouseover each HN comment
// - use a constant localStorage key
// future plans:
// - don't bell your own comments
// - a button to un-bell the current page (or comment subtree)
~/ml/shawwn-gpt-2$ inspect-checkpoint models/1558M/model.ckpt
+ exec python3 -m tensorflow.python.tools.inspect_checkpoint --file_name=models/1558M/model.ckpt
Init Plugin
Init Graph Optimizer
Init Kernel
model/h0/attn/c_attn/b (DT_FLOAT) [4800]
model/h0/attn/c_attn/w (DT_FLOAT) [1,1600,4800]
model/h0/attn/c_proj/b (DT_FLOAT) [1600]
model/h0/attn/c_proj/w (DT_FLOAT) [1,1600,1600]
model/h0/ln_1/b (DT_FLOAT) [1600]
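
The shapes above are consistent with GPT-2 1558M's hidden size of 1600 (the fused QKV attention projection is 3 × 1600 = 4800 wide). A quick sanity check of the per-tensor parameter counts, with the shape list copied from the output above:

```python
from math import prod

# Shapes copied from the inspect_checkpoint output above.
shapes = {
    "model/h0/attn/c_attn/b": [4800],
    "model/h0/attn/c_attn/w": [1, 1600, 4800],
    "model/h0/attn/c_proj/b": [1600],
    "model/h0/attn/c_proj/w": [1, 1600, 1600],
    "model/h0/ln_1/b": [1600],
}
counts = {name: prod(shape) for name, shape in shapes.items()}
print(counts["model/h0/attn/c_attn/w"])  # → 7680000
```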
— gwern is pleased he kept meticulous notes while setting up the original box so it's barely like 30 minutes to get the entire thing installed, formatted, ssh key-based logins working, basic perf optimizations like disabling reserved root blocks done etc. now just have to wait for the syncing from the old server to the new server, which should be fairly fast - they don't seem to be in the same datacenter, but it'll still be quick
20:39:03
20:42:18 <shawwwn> Shawn Presser
gwern: make me an account
20:42:31
Also give me your notes
20:42:40
Decided to use notion for that
20:42:46
@shawwn
shawwn / TensorLog - September 1st, 2021.md
Last active September 2, 2021 16:37
Stuff Tensorfork Labs did on Sep 1st

Some updates I posted in our AI Research Lab (aka Google Chat). Figured I'd throw 'em in a gist and tweet it out:


okay, so! I've been quiet, but not for lack of trying. Turns out it's tricky to set up an organization. We have mail.tensorfork.com, calendar.tensorfork.com, drive.tensorfork.com, groups.tensorfork.com, and sites.tensorfork.com. Groups seems like a mailing list (which reminds me that someone mentioned a mailing list a few days ago – good idea).

We also have a shared 1password vault now -- I'll be throwing all of our creds into that, so that y'all can get into everything.

As for "into what, exactly?" -- now that TL meta-work is mostly out of the way, I'll be provisioning a 64TB hetzner server (probably over the weekend), shoving imagenet up to it, then tweeting it out.

@shawwn
shawwn / JAX_compliation_cache.md
Last active January 2, 2024 15:46
JAX persistent compilation cache

JAX released a persistent compilation cache for TPU VMs! When enabled, the cache writes compiled JAX computations to disk so they don’t have to be re-compiled the next time you start your JAX program. This can save startup time if any of y’all have long compilation times.

First upgrade to the latest jax release:

pip install -U "jax[tpu]>=0.2.18" -f https://storage.googleapis.com/jax-releases/libtpu_releases.html

Then use the following to enable the cache in your jax code:

from jax.experimental.compilation_cache import compilation_cache as cc
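
The preview cuts off after the import. In the JAX releases of that era (≥ 0.2.18), the cache was enabled by initializing it with a directory path; a minimal sketch (the path here is an arbitrary choice, not from the gist):

```python
from jax.experimental.compilation_cache import compilation_cache as cc

# Any persistent, writable directory works; compiled computations
# are written here and reused across program restarts.
cc.initialize_cache("/tmp/jax_cache")
```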
@shawwn
shawwn / tpunicorn-tpu-vm-support.md
Last active May 30, 2022 16:29
`tpunicorn` TPU VM support

Hello! I've updated tpunicorn (https://github.com/shawwn/tpunicorn) with support for TPU VMs. But before I do a full release, I was hoping that a TPU VM user (like you!) would help me test the pre-release version. If it seems to work for you, let me know and I'll do a release announcement on my twitter (https://twitter.com/theshawwn) and plug TRC while I'm at it (https://blog.gpt4.org/jaxtpu).

What's tpunicorn?

I wanted an effortless way to manage all my TPUs, so I made tpunicorn. It's basically a TPU devops power tool.

Quickstart

SSH into one of your existing TPU VMs, then run:

@shawwn
shawwn / codex.json
Created August 12, 2021 17:16
The OpenAI Challenge JSON response
{"codexAttemptsByUser":0,"problem":{"codexAttemptsAllowed":10,"description":"<p>Given a the contents of a CSV file as <code>csv_contents</code>, return the difference in days between the date of the earliest and the oldest entry.</p>\n\n<p>The CSV file starts with a header row, which contains at least one column called <code>Date</code>.</p>\n\n<p>You are optionally provided with the <a href=\"https://pandas.pydata.org/\"><code>pandas</code></a> library if you need it.</p>\n\n<h3>Examples</h3>\n\n<table>\n<thead>\n<tr>\n <th>Input</th>\n <th><code>\"Date,Price,Volume\\n2014-01-27,550.50,1387\\n2014-06-23,910.83,4361\\n2014-05-20,604.51,5870\"</code></th>\n</tr>\n</thead>\n<tbody>\n<tr>\n <td>Output</td>\n <td><code>147</code></td>\n</tr>\n<tr>\n <td>Explanation</td>\n <td>There are 147 days between 2014-01-27 and 2014-06-23.</td>\n</tr>\n</tbody>\n</table>\n","fnName":"diff_days","inputs":["(\"Date,Price,Volume\\n2014-01-27,550.50,1387\\n2014-06-23,910.83,4361\\n2014-05-20,604.51,5870\",)","('Date\\n200
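
For reference, a stdlib-only solution consistent with the challenge's example table (the challenge also offers pandas; the description's "earliest and the oldest entry" evidently means earliest vs. latest, per the 147-day example):

```python
from datetime import date

def diff_days(csv_contents):
    # Find the Date column from the header row, then take max - min.
    rows = csv_contents.splitlines()
    i = rows[0].split(",").index("Date")
    dates = [date.fromisoformat(r.split(",")[i]) for r in rows[1:]]
    return (max(dates) - min(dates)).days

print(diff_days("Date,Price,Volume\n2014-01-27,550.50,1387\n"
                "2014-06-23,910.83,4361\n2014-05-20,604.51,5870"))  # → 147
```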