Yash Bonde yashbonde


File System UX on Postgres

The properties of a file system UX and how we implemented it in Postgres.

Before we begin, here's the table SQL:

CREATE TABLE documents (
	id text NOT NULL,
	collection_id text NULL,

NimbleBox Apprenticeship Open Challenges

Hi there, thank you for your interest in NimbleBox.

We built ChatNBX and ChainFury

We are looking for talented coders who sit at the intersection of ML and SDE. There are three positions:

  • ML Engineer: Systems level thinker, eats servers for lunch
  • ML Researcher: Model whisperer, any model, anytime, anywhere
  • Front-End engineer: UI Wizard, 'nuff said

Why the fury?

(ENG-01) The first engineering blog.

ChainFury started as a weekend hackathon but has since developed into a much bigger project (dare I say, one of the last systems). The core idea behind it is rapid development (with chains), deployment (with an embeddable chatbot UI), and gathering feedback on performance. Initially it was built with langflow as inspiration, which was in turn built on top of langchain.

Chandrani's written a great starting blog on ChainFury.

yashbonde / jupyterKernelRunner.go
Last active December 30, 2022 00:37
A simple example of a golang client for Jupyter Kernel.
// This is a copy of saturn/runner.py translated to Go
package main
import (
"encoding/json"
"fmt"
"io/ioutil"
"os"
yashbonde / guesslang.py
Last active April 18, 2022 09:29
[Guesslang](https://github.com/yoeo/guesslang/) is cool, but there are too many bells and whistles in it. Here is a script that makes it easy.
import os
import sys
import json
import logging
import subprocess
from pathlib import Path
try:
import tensorflow as tf
except:
yashbonde / hk.py
Last active February 20, 2022 12:40
#!/usr/bin/env python
# Copyright 2020 DeepMind Technologies Limited. All Rights Reserved.
# Licensed under the Apache License, Version 2.0
# Modifications copyright Yash Bonde (C) 2021 Nimblebox.ai, Inc.
# This file is peak Google! <3
# How far can you push Python before it's just too hard?
from typing import Any, Dict, Iterable, List, Tuple, Optional
yashbonde / gpt.py
Last active August 14, 2021 12:13
Using GPT right now is very tedious, as you have to keep calling the `model.generate()` method. This code simplifies that by making `__call__` first class and storing results in a searchable history!
# wrapper for using GPT generation first-class
# MIT - License, 2021, Yash Bonde
import os
import torch
import pickle
import hashlib
import warnings
import numpy as np
from time import time
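The idea in the description, a callable wrapper that keeps a searchable history, could look roughly like the sketch below. This is an illustration only, not the gist's actual code: `CallableGenerator`, `generate_fn`, and `search` are hypothetical names, and `generate_fn` stands in for whatever wraps `model.generate()` and decoding.

```python
import hashlib
from time import time
from typing import Callable, Dict, List


class CallableGenerator:
    """Wraps any `prompt -> text` function so it can be called directly.

    `generate_fn` is a hypothetical stand-in for a real call such as
    `model.generate()` plus decoding; it is an assumption of this sketch.
    """

    def __init__(self, generate_fn: Callable[[str], str]):
        self.generate_fn = generate_fn
        self.history: Dict[str, dict] = {}  # prompt hash -> stored record

    def __call__(self, prompt: str) -> str:
        # Hash the prompt so repeated prompts reuse the stored output.
        key = hashlib.md5(prompt.encode("utf-8")).hexdigest()
        if key not in self.history:
            self.history[key] = {
                "prompt": prompt,
                "output": self.generate_fn(prompt),
                "time": time(),
            }
        return self.history[key]["output"]

    def search(self, query: str) -> List[dict]:
        """Return every stored record whose prompt or output contains `query`."""
        q = query.lower()
        return [
            r for r in self.history.values()
            if q in r["prompt"].lower() or q in r["output"].lower()
        ]


if __name__ == "__main__":
    gpt = CallableGenerator(lambda p: p + " ... world")  # dummy generator
    print(gpt("Hello"))
    print(len(gpt.search("hello")))
```

Reusing the stored output for a repeated prompt is a design choice of this sketch; with sampling-based generation you may prefer to store every call instead.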
In this quick script we are trying to solve the sharding problem:
often in very large datasets there is no way to tokenize everything and store
it. Considering CLM datasets, we have a fixed dataset where each row
has a dynamic number of tokens. A dummy looks as follows:
 j    n    sequence (w/o EOT = 42)
[0]  [15]  [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13],
[1]  [13]  [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
[2]  [11]  [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[3]  [13]  [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
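One common way to handle rows like these without tokenizing and storing everything up front is to stream them: append EOT to each row, concatenate, and cut the stream into fixed-length samples. The sketch below assumes that approach, which is not necessarily what the script itself does. `make_dummy_rows`, `pack_rows`, and `seqlen` are hypothetical names; only `EOT = 42` and the row lengths come from the dummy table.

```python
from typing import Iterable, Iterator, List

EOT = 42  # end-of-text id, taken from the dummy table above


def make_dummy_rows(lengths: Iterable[int]) -> List[List[int]]:
    """Build rows like the table: n counts the EOT, so each row holds n - 1 tokens."""
    return [list(range(n - 1)) for n in lengths]


def pack_rows(rows: Iterable[List[int]], seqlen: int) -> Iterator[List[int]]:
    """Lazily append EOT to each row, concatenate, and yield chunks of `seqlen` tokens.

    Tokens left over at the end that do not fill a full chunk are dropped.
    """
    buffer: List[int] = []
    for row in rows:
        buffer.extend(row + [EOT])
        while len(buffer) >= seqlen:
            yield buffer[:seqlen]
            buffer = buffer[seqlen:]


if __name__ == "__main__":
    rows = make_dummy_rows([15, 13, 11, 13])
    for chunk in pack_rows(rows, seqlen=8):
        print(chunk)
```

Because `pack_rows` is a generator, only one chunk's worth of tokens (plus the current row) is held in memory at a time.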