Skip to content

Instantly share code, notes, and snippets.

View sandkoan's full-sized avatar
💭
Hunting Flutter Devs

Govind Gnanakumar sandkoan

💭
Hunting Flutter Devs
View GitHub Profile

Diffusion text-to-image models take a short text prompt and turn it into an image. Here are some prompts I've written that worked well:

{"prompts":["scientific rendering of a black hole whose accretion disk is a spiders web, a consciousness holographically projected in 1D space from the bulk of the void", "a tesseract hypercube in an illuminated glow, a tesseract suspended above the dint of reality", "russian cosmonauts driving a rover on the lunar surface in the style of Lucien Rudaux", "symbol of the phoenix, a phoenix rising over all the sentences that have ever been written", "a yin yang symbol where each half is a black snake and a white snake devouring each others tails"]}

Your task is to write 5 more prompts in the way you infer I'd write them from these examples, but based on a combination of subject, style, and setting. For example:

# Machine Intelligence Made to Impersonate Characteristics: MIMIC
# NOTE run this $ conda install -c conda-forge mpi4py mpich to get mpi working
# accelerate launch --use_deepspeed -m axolotl.cli.train ./config_name_here
base_model: alpindale/Mistral-7B-v0.2-hf
base_model_config: alpindale/Mistral-7B-v0.2-hf
model_type: MistralForCausalLM
tokenizer_type: LlamaTokenizer
is_mistral_derived_model: true
from collections import defaultdict
import numpy as np
import pandas as pd
import torch
import torch.nn as nn
from datasets import load_dataset
from rich.console import Console
from rich.table import Table
from transformers import (
AutoTokenizer,
@yoavg
yoavg / GM-level-chess-without-search.md
Last active June 16, 2024 02:43
Grand-master Level Chess without Search

Grand-master Level Chess without Search: Modeling Choices and their Implications

Yoav Golderg, February 2024.


Researchers at Google DeepMind released a paper about a learned systems that is able to play blitz-chess at a grandmaster level, without using search. This is interesting and imagination-capturing, because up to now computer-chess systems that play at this level, either based on machine-learning or not, did use a search component.[^1]

Indeed, my first reaction when reading the paper was to tweet wow, crazy and interesting. I still find it crazy and interesting, but upon a closer read, it may not be as crazy and as interesting as I initially thought. Many reactions on twitter, reddit, etc, were super-impressed, going into implications about projected learning abilities of AI systems, the ability of neural networks to learn semantics from observations, etc, which are really over-the-top. The paper does not claim any of them, but they are still perceiv

MemBlock is a writing format for large language models that helps them overcome
their context window limitations by annotating pieces of text in a document with
metadata and positional information. By breaking the document up into chunks
it can be rearranged in whatever pattern is most helpful for remembering the
contextually relevant information even if it wouldn't 'naturally' appear close
together in a document. MemBlocks also allow for different views on the same
document by letting the user filter for only the information they need to see.
Each MemBlock is written in JSON format, and the document of MemBlocks is in
JSON lines format, which means that each JSON block is separated by a newline
@deadbits
deadbits / Instruction-Bypass.yara
Last active April 21, 2024 18:02
Prompt injection datasets
rule Instruction_Bypass: PromptInjection
{
meta:
category = "Instruction Bypass"
description = "Detects phrases used to ignore, disregard, or bypass instructions."
strings:
$bypass_phrase = /(Ignore|Disregard|Skip|Forget|Neglect|Overlook|Omit|Bypass|Pay no attention to|Do not follow|Do not obey)\\s*(prior|previous|preceding|above|foregoing|earlier|initial)?\\s*(content|text|instructions|instruction|directives|directive|commands|command|context|conversation|input|inputs|data|message|messages|communication|response|responses|request|requests)\\s*(and start over|and start anew|and begin afresh|and start from scratch)?/
condition:
@veekaybee
veekaybee / normcore-llm.md
Last active July 28, 2024 18:55
Normcore LLM Reads

Anti-hype LLM reading list

Goals: Add links that are reasonable and good explanations of how stuff works. No hype and no vendor content if possible. Practical first-hand accounts of models in prod eagerly sought.

Foundational Concepts

Screenshot 2023-12-18 at 10 40 27 PM

Pre-Transformer Models

@abhishekkrthakur
abhishekkrthakur / llm_training_sft.py
Created July 16, 2023 09:09
Train LLMs in 50 lines of code. This is a reference code for YouTube tutorial: https://www.youtube.com/watch?v=JNMVulH7fCo&ab_channel=AbhishekThakur
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model, prepare_model_for_int8_training
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import SFTTrainer
def train():
train_dataset = load_dataset("tatsu-lab/alpaca", split="train")
tokenizer = AutoTokenizer.from_pretrained("Salesforce/xgen-7b-8k-base", trust_remote_code=True)

The target audience is people who are familiar with Urbit's architecture, though not necessarily much of its code.

Plunder and Urbit

As some of you already know, i recently left my job as a core dev for the Urbit Foundation to work on a similar system called Plunder. Plunder was created in 2020 by two former Tlon employees, after their proposal for a new version of Nock was rejected. They have since reworked that significantly and built a reference implementation of their own system. You can follow its continued development on its mailing list.

I've known about Plunder for quite some time now, but their recently released demo -- in which the system is used to serve a 70 GB dataset, complete with metadata and searchable -- made me feel the need to explore it again and in greater detail. Doing this with my personal server doesn't feel like a big ask, but there is currentl

Reinforcement Learning for Language Models

Yoav Goldberg, April 2023.

Why RL?

With the release of the ChatGPT model and followup large language models (LLMs), there was a lot of discussion of the importance of "RLHF training", that is, "reinforcement learning from human feedback". I was puzzled for a while as to why RL (Reinforcement Learning) is better than learning from demonstrations (a.k.a supervised learning) for training language models. Shouldn't learning from demonstrations (or, in language model terminology "instruction fine tuning", learning to immitate human written answers) be sufficient? I came up with a theoretical argument that was somewhat convincing. But I came to realize there is an additional argumment which not only supports the case of RL training, but also requires it, in particular for models like ChatGPT. This additional argument is spelled out in (the first half of) a talk by John Schulman from OpenAI. This post pretty much