Michael J Clark wassname

## twohot.md

      
wassname / twohot.md (last active January 14, 2024 02:09): two-hot encoding notes

    What is two-hot encoding?

Description

Two-hot encoding was introduced in 2017 by Marc G. Bellemare et al. in "A Distributional Perspective on Reinforcement Learning", but the clearest description is in the 2023 DreamerV3 paper by Danijar Hafner et al., where it is used for reward and value distributions.

Two-hot encoding is a generalization of one-hot encoding to continuous values. It produces a vector of length |B| (one entry per bin in B) where all elements are 0 except for the two entries closest to the encoded continuous number, at positions k and k + 1. These two entries sum to 1, with more weight given to the entry that is closer to the encoded number.
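
The description above can be sketched in a few lines. This is a minimal pure-Python illustration, not the code from either paper: the function name `two_hot`, the evenly spaced `bins`, and the choice to clamp out-of-range values to the outermost bins are all assumptions made for the example.

```python
from bisect import bisect_right

def two_hot(x, bins):
    """Encode scalar x over sorted bin centers `bins` (hypothetical helper)."""
    # Assumption: values outside the bin range are clamped to the edge bins.
    x = min(max(x, bins[0]), bins[-1])
    # k is the bin at or below x, capped so that k + 1 is a valid index.
    k = min(max(bisect_right(bins, x) - 1, 0), len(bins) - 2)
    # Linear interpolation: the closer bin receives the larger weight.
    upper_weight = (x - bins[k]) / (bins[k + 1] - bins[k])
    encoded = [0.0] * len(bins)
    encoded[k] = 1.0 - upper_weight
    encoded[k + 1] = upper_weight
    return encoded

bins = [-2.0, -1.0, 0.0, 1.0, 2.0]
encoded = two_hot(0.25, bins)
# Decoding is the weighted sum of the bin centers.
decoded = sum(w * b for w, b in zip(encoded, bins))
print(encoded, decoded)
```

Because the two weights interpolate linearly between neighbouring bins, the weighted sum of the bin centers recovers the original value whenever it lies inside the bin range.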

Code samples


## torch_scalar.py
"""
how to wrap a scikit-learn scaler like RobustScaler for pytorch
"""
import torch
import numpy as np
from einops import rearrange
from sklearn.preprocessing import StandardScaler, RobustScaler

class TorchRobustScaler(RobustScaler):

## style_df.py
"""
you cannot display, you need to specify html
- see also https://pandas.pydata.org/docs/user_guide/style.html#Builtin-Styles
"""
import pandas as pd
from IPython.display import display, HTML

df = pd.DataFrame({
    "strings": ["Adam", "Mike"],
    "ints": [1, 3],

## argparse_in_jupyter.py
"""
sometimes you want to run or adapt a cli script from jupyter; here is a decent way to do it
"""

argvs = """
--rank 16
--context=128
--vae_context=64
"""
argvs = argvs.replace('\n', ' ').strip()
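
The preview above stops before the arguments are parsed. As a hedged sketch of how such a string can be fed to `argparse` inside a notebook: the parser below is hypothetical (only the flag names come from the snippet; the types and defaults are assumptions), and it uses `shlex.split` in place of the `replace`/`strip` step so that quoted values would also survive.

```python
import argparse
import shlex

# Hypothetical parser: flag names mirror the snippet, defaults are made up.
parser = argparse.ArgumentParser()
parser.add_argument("--rank", type=int, default=8)
parser.add_argument("--context", type=int, default=256)
parser.add_argument("--vae_context", type=int, default=32)

argvs = """
--rank 16
--context=128
--vae_context=64
"""

# shlex.split treats newlines as whitespace, so the multi-line string
# becomes a normal argv list. Passing it explicitly stops argparse from
# reading sys.argv, which in Jupyter holds the kernel's own arguments.
args = parser.parse_args(shlex.split(argvs))
print(args.rank, args.context, args.vae_context)
```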

## gpt4v_on_public_eng_docs.ipynb

      
wassname / gpt4v_on_public_eng_docs.ipynb (created November 7, 2023 00:29): gpt4v on public domain engineering docs

      
    
## emojis.json
{
  "🦆": {
    "tags": [
      "Waterfowl",
      "Bird",
      "Quack"
    ],
    "usage": [
      "🦆🌊: swimming duck",
      "🦆🍞: feeding ducks",

## STOP_DOING_MATH.md

      
wassname / STOP_DOING_MATH.md (created October 21, 2023 01:34): STOP DOING MATH (in markdown and text since I couldn't find it anywhere on the web)

    STOP DOING MATH

NUMBERS WERE NOT SUPPOSED TO BE GIVEN NAMES
YEARS OF COUNTING yet NO REAL-WORLD USE FOUND for going higher than your FINGERS
Wanted to go higher anyway for a laugh? We had a tool for that: It was called "GUESSING"
"Yes please give me ZERO of something. Please give me INFINITE of it" - Statements dreamed up by the utterly Deranged

LOOK at what Mathematicians have been demanding your Respect for all this time, with all the calculators & abacus we built for them

  
## split_by_token.py
"""
When splitting text for Language Models, aim for two properties:

 - Limit tokens to a maximum size (e.g., 400)
 - Use natural boundaries for splits (e.g. ".")

Many splitters don't enforce a token size limit, causing errors like "device assert" or "out of memory." Others focus on character length rather than token length. To address these issues:

- Use RecursiveCharacterTextSplitter from the langchain library
- Set the last separator to an empty string '' to ensure there is always a splitting point, thus maintaining token limits
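
The idea behind the two bullets above can be illustrated without any dependencies. The sketch below is not langchain's implementation and does not use its API; `split_by_tokens` and `estimate_tokens` are hypothetical names, and the "4 characters per token" estimate is a stand-in assumption for a real tokenizer's length function.

```python
import math

def estimate_tokens(text):
    # Stand-in tokenizer (assumption): roughly 4 characters per token.
    return math.ceil(len(text) / 4)

def split_by_tokens(text, max_tokens, count_tokens=estimate_tokens,
                    separators=("\n\n", "\n", ". ", " ", "")):
    """Split on the coarsest separator that works, recursing to finer ones."""
    if count_tokens(text) <= max_tokens or not separators:
        return [text] if text else []
    sep, finer = separators[0], separators[1:]
    # The empty separator splits into single characters, so a split point
    # always exists and the token limit can always be met.
    pieces = list(text) if sep == "" else text.split(sep)
    chunks, current = [], ""
    for piece in pieces:
        candidate = current + sep + piece if current else piece
        if count_tokens(candidate) <= max_tokens:
            current = candidate  # keep merging while under the limit
            continue
        if current:
            chunks.append(current)
        if count_tokens(piece) <= max_tokens:
            current = piece
        else:
            # The piece alone is too long: retry with finer separators.
            current = ""
            chunks.extend(split_by_tokens(piece, max_tokens, count_tokens, finer))
    if current:
        chunks.append(current)
    return chunks

text = "Short sentence. Another short one. " + "x" * 60
chunks = split_by_tokens(text, max_tokens=5)
print(chunks)
```

The two sentences are split at the natural ". " boundary, while the unbroken run of characters can only be cut by the final empty separator. In this simplified version the separator at a chunk boundary is dropped rather than kept.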

## cuda_11.8_installation_on_Ubuntu_22.04
#!/bin/bash

### steps ####
# verify the system has a cuda-capable gpu
# download and install the nvidia cuda toolkit and cudnn
# set up environment variables
# verify the installation
###

### to verify your gpu is cuda-enabled, check

## lightning_start.py
"""
This is a template for starting with PyTorch Lightning. It includes many extra things because it's easier to delete than to reinvent.

It is written for these versions:
- lightning==2.0.2
- pytorch-optimizer==2.8.0
"""

import torch
import torch.nn as nn