Skip to content

Instantly share code, notes, and snippets.

View cedrickchee's full-sized avatar
⚡ 🦀 🐿️ 🐘 🐳 ⬡ ⚛️ 🚢 🚀 🦄 🍵

Cedric Chee cedrickchee

⚡ 🦀 🐿️ 🐘 🐳 ⬡ ⚛️ 🚢 🚀 🦄 🍵
View GitHub Profile

What Makes a True AI Coding Assistant?

What is an AI Coding Assistant?

If the coding assistant can't run ITERATIVE CRUD on ALL of your code, it's not a True AI Coding Assistant (TACA)

Standards for True AI Coding Assistants

  1. Must work on existing codebases
  2. Must have a file context mechanism
cedrickchee /
Created March 31, 2024 16:33 — forked from thesamesam/
xz-utils backdoor situation

FAQ on the xz-utils backdoor


On March 29th, 2024, a backdoor was discovered in xz-utils, a suite of software that gives developers lossless compression. This package is commonly used for compressing release tarballs, software packages, kernel images, and initramfs images. It is very widely distributed, statistically your average Linux or macOS system will have it installed for

cedrickchee /
Created July 12, 2023 05:22 — forked from rain-1/
How to run Llama 13B with a 6GB graphics card

This worked on 14/May/23. The instructions will probably require updating in the future.

llama is a text prediction model similar to GPT-2, and the version of GPT-3 that has not been fine tuned yet. It is also possible to run fine tuned versions (like alpaca or vicuna with this. I think. Those versions are more focused on answering questions)

Note: I have been told that this does not support multiple GPUs. It can only use a single GPU.

It is possible to run LLama 13B with a 6GB graphics card now! (e.g. a RTX 2060). Thanks to the amazing work involved in llama.cpp. The latest change is CUDA/cuBLAS which allows you pick an arbitrary number of the transformer layers to be run on the GPU. This is perfect for low VRAM.

  • Clone llama.cpp from git, I am on commit 08737ef720f0510c7ec2aa84d7f70c691073c35d.
cedrickchee / ai-plugin.json
Created March 25, 2023 14:22 — forked from danielgross/ai-plugin.json
ChatGPT Plugin for Twilio
"schema_version": "v1",
"name_for_model": "twilio",
"name_for_human": "Twilio Plugin",
"description_for_model": "Plugin for integrating the Twilio API to send SMS messages and make phone calls. Use it whenever a user wants to send a text message or make a call using their Twilio account.",
"description_for_human": "Send text messages and make phone calls with Twilio.",
"auth": {
"type": "user_http",
"authorization_type": "basic"
cedrickchee /
Created March 6, 2023 13:28 — forked from shawwn/
How I run 65B using my fork of llama at
mp=1; size=7B; # to run 7B
mp=8; size=65B; # to run 65B
for seed in $(randint 1000000)
export TARGET_FOLDER=~/ml/data/llama/LLaMA
time python3 -m --nproc_per_node $mp --ckpt_dir $TARGET_FOLDER/$size --tokenizer_path $TARGET_FOLDER/tokenizer.model --seed $seed --max_seq_len 2048 --max_gen_len 2048 --count 0 | tee -a ${size}_startrek.txt
cedrickchee /
Last active January 24, 2024 06:16 — forked from yoavg/
Fix typos and grammar of the original writing.

Some remarks on Large Language Models

Yoav Goldberg, January 2023

Audience: I assume you heard of ChatGPT, maybe played with it a little, and was impressed by it (or tried very hard not to be). And that you also heard that it is "a large language model". And maybe that it "solved natural language understanding". Here is a short personal perspective of my thoughts of this (and similar) models, and where we stand with respect to language understanding.


Around 2014-2017, right within the rise of neural-network based methods for NLP, I was giving a semi-academic-semi-popsci lecture, revolving around the story that achieving perfect language modeling is equivalent to being as intelligent as a human. Somewhere around the same time I was also asked in an academic panel "what would you do if you were given infinite compute and no need to worry about labor costs" to which I cockily responded "I would train a really huge language model, just to show that it doesn't solve everything!". We

cedrickchee /
Last active January 11, 2023 10:53 — forked from veekaybee/
Everything I understand about chatgpt

ChatGPT Resources


ChatGPT appeared like an explosion on all my social media timelines in early December 2022. While I keep up with machine learning as an industry, I wasn't focused so much on this particular corner, and all the screenshots seemed like they came out of nowehre. What was this model? How did the chat prompting work? What was the context of OpenAI doing this work and collecting my prompts for training data?

I decided to do a quick investigation. Here's all the information I've found so far. I'm aggregating and synthesizing it as I go, so it's currently changing pretty frequently.

Model Architecture

cedrickchee /
Created November 28, 2022 09:14 — forked from jamesmacwhite/
Easy way to convert MKV to MP4 with ffmpeg

Converting mkv to mp4 with ffmpeg

Essentially just copy the existing video and audio stream as is into a new container, no funny business!

The easiest way to "convert" MKV to MP4, is to copy the existing video and audio streams and place them into a new container. This avoids any encoding task and hence no quality will be lost, it is also a fairly quick process and requires very little CPU power. The main factor is disk read/write speed.

With ffmpeg this can be achieved with -c copy. Older examples may use -vcodec copy -acodec copy which does the same thing.

These examples assume ffmpeg is in your PATH. If not just substitute with the full path to your ffmpeg binary.

Single file conversion example

cedrickchee /
Created September 1, 2022 16:47 — forked from raphlinus/
Translation of into English

The not-too-clever programmer

This is a translation of into clear English. All props to the original author.


This is a collection of thoughts on software development, originally written by an pseudonymous author styling themselves the "grug brain developer," but then translated into clear English by Raph Levien.

I am not an extremely smart developer, but I have many years of experience and have learned some things, although still don't know everything.

cedrickchee / latency.markdown
Created December 3, 2021 16:14 — forked from hellerbarde/latency.markdown
Latency numbers every programmer should know

Latency numbers every programmer should know

L1 cache reference ......................... 0.5 ns
Branch mispredict ............................ 5 ns
L2 cache reference ........................... 7 ns
Mutex lock/unlock ........................... 25 ns
Main memory reference ...................... 100 ns             
Compress 1K bytes with Zippy ............. 3,000 ns  =   3 µs
Send 2K bytes over 1 Gbps network ....... 20,000 ns  =  20 µs
SSD random read ........................ 150,000 ns  = 150 µs

Read 1 MB sequentially from memory ..... 250,000 ns = 250 µs