On
Last active May 2, 2024 13:39
On Llamafile

On Llamafile not making sense

The LLamafile project doesn't make sense.

The claim is that it is "bringing LLMs to the people", but you could already run an LLM - which is a large binary file containing lots of floating point numbers - by using llama.cpp.

Llamafile joins a compiled binary program to run LLMs with a weights binary into a single file. This isn't a useful goal. you could simply distribute a zip containing an .exe and a weights file together. Or better still: Decouple the program that runs these chatbots from the chatbot weights.

Imagine if PNG files were also an executable that could pop open a window that displays a PNG on your computer. There is a reason we don't do this: It's not good engineering.

Scheme WASM Tail Call
Created November 8, 2023 14:24
Scheme WASM Tail Call Situation

This spec seems to have gotten in thanks to work by apignotti,

There is also a very interesting project for generalized effect handlers that may build on top of this platform

Great news for schemers with web browsers.


Last active April 23, 2024 05:25

Solving crosswords with GPT

This is my research report. I've included a lot of the code and chat interactions for people to read through if interested. I worked on this crossword

I had a vision for a GPT powered crossword solver. My idea is that it would do a tree search over GPT generated guesses that would include the knowns so far, like:


I didn't end up doing that because ChatGPT and GPT-4 are terrible at questions involving the length of words, or guessing words that contain specific letters at specific locations. It can sometimes do them but usually fails. I think this is because it's token based. I am curious whether a character based LLM would be better at such tasks.


Last active May 29, 2024 13:27
How to run Llama 13B with a 6GB graphics card

This worked on 14/May/23. The instructions will probably require updating in the future.

llama is a text prediction model similar to GPT-2, and the version of GPT-3 that has not been fine tuned yet. It is also possible to run fine tuned versions (like alpaca or vicuna with this. I think. Those versions are more focused on answering questions)

Note: I have been told that this does not support multiple GPUs. It can only use a single GPU.

It is possible to run LLama 13B with a 6GB graphics card now! (e.g. a RTX 2060). Thanks to the amazing work involved in llama.cpp. The latest change is CUDA/cuBLAS which allows you pick an arbitrary number of the transformer layers to be run on the GPU. This is perfect for low VRAM.

  • Clone llama.cpp from git, I am on commit 08737ef720f0510c7ec2aa84d7f70c691073c35d.
Prompt Injection and
Last active September 11, 2023 11:12
Prompt Injection and AutoGPT

Does prompt injection matter to AutoGPT?

Executive summary: If you use AutoGPT, you need to be aware of prompt injection. This is a serious problem that can cause your AutoGPT agent to perform unexpected and unwanted tasks. Unfortunately, there isn't a perfect solution to this problem available yet.

Prompt injection can derail agents

If you set up an AutoGPT agent to perform task A, a prompt injection could 'derail' it into performing task B instead. Task B could be anything. Even something unwanted like deleting your personal files or sending all your bitcoins to some crooks wallet.

Docker helps limit the file system access that agents have. Measures like this are extremely useful. It's important to note that the agent can still be derailed.


Last active May 7, 2023 22:09

This is a report on my experience pair programming with Bard on a neural network task that challenged it to its current limits.

Bard now has the ability to program, or put another way Google has removed the gating that blocked it from trying.

All the code in this article is basically 99% produced by Bard. I either prompted it to refactor things or I just tweaked one line or two lines of every 100.

Note: I used gpt-4 a little bit too, for the training part, but this is mostly Bard.



Last active January 24, 2024 09:05

Could an LLM end up being the core part of a dangerous computer worm?

How would we neutralize such a thing if this happened?

Some virus and worm background

There is a hilarious story from about an early computer virus called robin hood and friar tuck. This was basically just two programs running on a UNIX system that would look out for each other and reboot the other process if it was killed. It's interesting to note that since computer programs run thousands of times faster than humans, a human can't type kill -9 robinhood then type kill -9 friartuck in time. The computer is faster so it always wins if you try this. To defeat this you need to take a different approach than speed.

Google translate


Give a critique that attempts to refute Searle's chinese room. Include something from derrida in your response.



GPT-4 Reverse Turing
Last active May 28, 2024 17:40
GPT-4 Reverse Turing Test

The reverse turing test

I asked GPT-4 to come up with 10 questions to determine if the answerer was AI or human.

I provided my own answers for these questions and I also asked ChatGPT to answer them.

The result is that GPT-4 was able to correctly differentiate between AI and Human.


Last active June 10, 2024 10:55
LLM Introduction: Learn Language Models


Bootstrap knowledge of LLMs ASAP. With a bias/focus to GPT.

Avoid being a link dump. Try to provide only valuable well tuned information.


Neural network links before starting with transformers.