Skip to content

Instantly share code, notes, and snippets.

View rain-1's full-sized avatar
☂️
Umbrella

rain1 rain-1

☂️
Umbrella
View GitHub Profile
@rain-1
rain-1 / Scheme WASM Tail Call Situation.md
Created November 8, 2023 14:24
Scheme WASM Tail Call Situation

This spec seems to have gotten in thanks to work by apignotti, https://hn.algolia.com/?q=WebAssembly+tail+calls

There is also a very interesting project for generalized effect handlers that may build on top of this platform https://wasmfx.dev/community/

Great news for schemers with web browsers.

@rain-1
rain-1 / 000-crossword-solving.md
Last active May 31, 2023 16:19
crossword-solving-with-gpt.md

Solving crosswords with GPT

This is my research report. I've included a lot of the code and chat interactions for people to read through if interested. I worked on this crossword https://www.theguardian.com/crosswords/quick/16553

I had a vision for a GPT powered crossword solver. My idea is that it would do a tree search over GPT generated guesses that would include the knowns so far, like:

givens

I didn't end up doing that because ChatGPT and GPT-4 are terrible at questions involving the length of words, or guessing words that contain specific letters at specific locations. It can sometimes do them but usually fails. I think this is because it's token based. I am curious whether a character based LLM would be better at such tasks.

@rain-1
rain-1 / llama-home.md
Last active February 22, 2024 05:52
How to run Llama 13B with a 6GB graphics card

This worked on 14/May/23. The instructions will probably require updating in the future.

llama is a text prediction model similar to GPT-2, and the version of GPT-3 that has not been fine tuned yet. It is also possible to run fine tuned versions (like alpaca or vicuna with this. I think. Those versions are more focused on answering questions)

Note: I have been told that this does not support multiple GPUs. It can only use a single GPU.

It is possible to run LLama 13B with a 6GB graphics card now! (e.g. a RTX 2060). Thanks to the amazing work involved in llama.cpp. The latest change is CUDA/cuBLAS which allows you pick an arbitrary number of the transformer layers to be run on the GPU. This is perfect for low VRAM.

  • Clone llama.cpp from git, I am on commit 08737ef720f0510c7ec2aa84d7f70c691073c35d.
@rain-1
rain-1 / Prompt Injection and AutoGPT.md
Last active September 11, 2023 11:12
Prompt Injection and AutoGPT

Does prompt injection matter to AutoGPT?

Executive summary: If you use AutoGPT, you need to be aware of prompt injection. This is a serious problem that can cause your AutoGPT agent to perform unexpected and unwanted tasks. Unfortunately, there isn't a perfect solution to this problem available yet.

Prompt injection can derail agents

If you set up an AutoGPT agent to perform task A, a prompt injection could 'derail' it into performing task B instead. Task B could be anything. Even something unwanted like deleting your personal files or sending all your bitcoins to some crooks wallet.

Docker helps limit the file system access that agents have. Measures like this are extremely useful. It's important to note that the agent can still be derailed.

@rain-1
rain-1 / bard-lstm.md
Last active May 7, 2023 22:09
bard-lstm.md

This is a report on my experience pair programming with Bard on a neural network task that challenged it to its current limits.

Bard now has the ability to program, or put another way Google has removed the gating that blocked it from trying.

All the code in this article is basically 99% produced by Bard. I either prompted it to refactor things or I just tweaked one line or two lines of every 100.

Note: I used gpt-4 a little bit too, for the training part, but this is mostly Bard.

XOR

@rain-1
rain-1 / WorLLMs.md
Last active January 24, 2024 09:05
WorLLMs

Could an LLM end up being the core part of a dangerous computer worm?

How would we neutralize such a thing if this happened?

Some virus and worm background

There is a hilarious story from https://users.cs.utah.edu/~elb/folklore/xerox.txt about an early computer virus called robin hood and friar tuck. This was basically just two programs running on a UNIX system that would look out for each other and reboot the other process if it was killed. It's interesting to note that since computer programs run thousands of times faster than humans, a human can't type kill -9 robinhood then type kill -9 friartuck in time. The computer is faster so it always wins if you try this. To defeat this you need to take a different approach than speed.

Google translate

in:

Give a critique that attempts to refute Searle's chinese room. Include something from derrida in your response.

out:

给出一个试图反驳塞尔中文房间的批评。在您的回复中包含德里达的内容。

@rain-1
rain-1 / GPT-4 Reverse Turing Test.md
Last active February 24, 2024 18:31
GPT-4 Reverse Turing Test

The reverse turing test

I asked GPT-4 to come up with 10 questions to determine if the answerer was AI or human.

I provided my own answers for these questions and I also asked ChatGPT to answer them.

The result is that GPT-4 was able to correctly differentiate between AI and Human.

@rain-1
rain-1 / LLM.md
Last active March 16, 2024 15:14
LLM Introduction: Learn Language Models

Purpose

Bootstrap knowledge of LLMs ASAP. With a bias/focus to GPT.

Avoid being a link dump. Try to provide only valuable well tuned information.

Prelude

Neural network links before starting with transformers.

@rain-1
rain-1 / 0-MNIST.md
Last active March 25, 2023 03:41
MNIST digit classification

MNIST digit recognition

The pytorch (neural network library) examples include a script to try out the training process for MNIST digit recognition data set: https://github.com/pytorch/examples/tree/main/mnist

This builds up a convolutional neural network that takes one of these pictures and processes it down to 10 neurons. The training process uses two sets of labelled data (examples of pictures of digits and which of the 10 possible digits they are): One training set and one testing set. The training set is used to manipulate all of the "weights" inside the neural network by moving in the (very high dimensional) direction of fastest descent, aiming to get the output neurons to produce the intended label given the input picture. The testing set is used as a metric to say how well the neural network is doing.

I ran this, creating mnist_cnn.pt with 99% accuracy on the test data set.

Then I wanted to see if it worked, so I drew images of all 10 digits. There was no way to try this out so I wrote the attach